Method and product for localized or spatial detection of nucleic acid in a tissue sample

ABSTRACT

Localized detection of RNA in a tissue sample that includes cells is accomplished on an array. The array includes a number of features on a substrate. Each feature includes a different capture probe immobilized such that the capture probe has a free 3′ end. Each feature occupies a distinct position on the array and has an area of less than about 1 mm². Each capture probe is a nucleic acid molecule, which includes a positional domain including a nucleotide sequence unique to a particular feature, and a capture domain including a nucleotide sequence complementary to the RNA to be detected. The capture domain can be at a position 3′ of the positional domain.

PRIORITY AND CROSS REFERENCE TO RELATED APPLICATIONS

This application is a Continuation Application of U.S. application Ser. No. 14/111,482, filed Oct. 11, 2013 and published as US 2014/0066381 A1 on Mar. 6, 2014, which is a U.S. National Phase Application under 35 U.S.C. § 371 of International Application No. PCT/EP2012/056823, filed Apr. 13, 2012, designating the U.S. and published in the English language as WO 2012/140224 A1 on Oct. 18, 2012, which claims the benefit of U.K. Application No. 1106254.4, filed Apr. 13, 2011. Any and all applications for which a foreign or a domestic priority is claimed is/are identified in the Application Data Sheet filed herewith and is/are hereby incorporated by reference in their entirety under 37 C.F.R. § 1.57.

SEQUENCE LISTING IN ELECTRONIC FORMAT

The present application is being filed along with an Electronic Sequence Listing as an ASCII text file via EFS-Web. The Electronic Sequence Listing is provided as a file entitled DEHN53001C1SEQLIST.txt, created and last saved on Jun. 18, 2018, which is 30,768 bytes in size. The information in the Electronic Sequence Listing is incorporated herein by reference in its entirety.

The present invention relates generally to the localised or spatial detection of nucleic acid in a tissue sample. The nucleic acid may be RNA or DNA. Thus, the present invention provides methods for detecting and/or analysing RNA, e.g. RNA transcripts or genomic DNA, so as to obtain spatial information about the localisation, distribution or expression of genes, or indeed about the localisation or distribution of any genomic variation (not necessarily in a gene) in a tissue sample, for example in an individual cell. The present invention thus enables spatial genomics and spatial transcriptomics.

More particularly, the present invention relates to a method for determining and/or analysing a transcriptome or genome and especially the global transcriptome or genome, of a tissue sample. In particular the method relates to a quantitative and/or qualitative method for analysing the distribution, location or expression of genomic sequences in a tissue sample wherein the spatial expression or distribution or location pattern within the tissue sample is retained. Thus, the new method provides a process for performing “spatial transcriptomics” or “spatial genomics”, which enables the user to determine simultaneously the expression pattern, or the location/distribution pattern of the genes expressed or genes or genomic loci present in a tissue sample.

The invention is particularly based on array technology coupled with high throughput DNA sequencing technologies, which allows the nucleic acid molecules (e.g. RNA or DNA molecules) in the tissue sample, particularly mRNA or DNA, to be captured and labelled with a positional tag. This step is followed by synthesis of DNA molecules which are sequenced and analysed to determine which genes are expressed in any and all parts of the tissue sample. Advantageously, the individual, separate and specific transcriptome of each cell in the tissue sample may be obtained at the same time. Hence, the methods of the invention may be said to provide highly parallel comprehensive transcriptome signatures from individual cells within a tissue sample without losing spatial information within said investigated tissue sample. The invention also provides an array for performing the method of the invention and methods for making the arrays of the invention.

The human body comprises over 100 trillion cells and is organized into more than 250 different organs and tissues. The development and organization of complex organs, such as the brain, are far from understood and there is a need to dissect the expression of genes expressed in such tissues using quantitative methods to investigate and determine the genes that control the development and function of such tissues. The organs are in themselves a mixture of differentiated cells that enable all bodily functions, such as nutrient transport, defence etc. to be coordinated and maintained. Consequently, cell function is dependent on the position of the cell within a particular tissue structure and the interactions it shares with other cells within that tissue, both directly and indirectly. Hence, there is a need to disentangle how these interactions influence each cell within a tissue at the transcriptional level.

Recent findings by deep RNA sequencing have demonstrated that a majority of the transcripts can be detected in a human cell line and that a large fraction (75%) of the human protein-coding genes are expressed in most tissues. Similarly, a detailed study of 1% of the human genome showed that chromosomes are ubiquitously transcribed and that the majority of all bases are included in primary transcripts. The transcription machinery can therefore be described as promiscuous at a global level.

It is well-known that transcripts are merely a proxy for protein abundance, because the rates of RNA translation, degradation etc. will influence the amount of protein produced from any one transcript. In this respect, a recent antibody-based analysis of human organs and tissues suggests that tissue specificity is achieved by precise regulation of protein levels in space and time, and that different tissues in the body acquire their unique characteristics by controlling not which proteins are expressed but how much of each is produced.

However, in subsequent global studies transcriptome and proteome correlations have been compared demonstrating that the majority of all genes were shown to be expressed. Interestingly, there was shown to be a high correlation between changes in RNA and protein levels for individual gene products which is indicative of the biological usefulness of studying the transcriptome in individual cells in the context of the functional role of proteins.

Indeed, analysis of the histology and expression pattern in tissues is a cornerstone in biomedical research and diagnostics. Histology, utilizing different staining techniques, first established the basic structural organization of healthy organs and the changes that take place in common pathologies more than a century ago. Developments in this field resulted in the possibility of studying protein distribution by immunohistochemistry and gene expression by in situ hybridization.

However, the parallel development of increasingly advanced histological and gene expression techniques has resulted in the separation of imaging and transcriptome analysis and, until the methods of the present invention, there has not been any feasible method available for global transcriptome analysis with spatial resolution.

As an alternative, or in addition, to in situ techniques, methods have been developed for the in vitro analysis of proteins and nucleic acids, i.e. by extracting molecules from whole tissue samples, single cell types, or even single cells, and quantifying specific molecules in said extracts, e.g. by ELISA, qPCR etc.

Recent developments in the analysis of gene expression have resulted in the possibility of assessing the complete transcriptome of tissues using microarrays or RNA sequencing, and such developments have been instrumental in our understanding of biological processes and for diagnostics. However, transcriptome analysis typically is performed on mRNA extracted from whole tissues (or even whole organisms), and methods for collecting smaller tissue areas or individual cells for transcriptome analysis are typically labour intensive, costly and have low precision.

Hence, the majority of gene expression studies based on microarrays or next generation sequencing of RNA use a representative sample containing many cells. Thus the results represent the average expression levels of the investigated genes. The separation of cells that are phenotypically different has been used in some cases together with the global gene expression platforms (Tang F et al, Nat Protoc. 2010; 5: 516-35; Wang D & Bodovitz S, Trends Biotechnol. 2010; 28: 281-90) and resulted in very precise information about cell-to-cell variations. However, high throughput methods to study transcriptional activity with high resolution in intact tissues have not, until now, been available.

Thus, existing techniques for the analysis of gene expression patterns provide spatial transcriptional information only for one or a handful of genes at a time or offer transcriptional information for all of the genes in a sample at the cost of losing positional information. Hence, it is evident that methods to determine simultaneously, separately and specifically the transcriptome of each cell in a sample are required, i.e. to enable global gene expression analysis in tissue samples that yields transcriptomic information with spatial resolution, and the present invention addresses this need.

The novel approach of the methods and products of the present invention utilizes now well established array and sequencing technology to yield transcriptional information for all of the genes in a sample, whilst retaining the positional information for each transcript. It will be evident to the person of skill in the art that this represents a milestone in the life sciences. The new technology opens a new field of so-called “spatial transcriptomics”, which is likely to have profound consequences for our understanding of tissue development and tissue and cellular function in all multicellular organisms. It will be apparent that such techniques will be particularly useful in our understanding of the cause and progress of disease states and in developing effective treatments for such diseases, e.g. cancer. The methods of the invention will also find uses in the diagnosis of numerous medical conditions.

Whilst initially conceived with the aim of transcriptome analysis in mind, as described in detail below, the principles and methods of the present invention may be applied also to the analysis of DNA and hence for genomic analyses also (“spatial genomics”). Accordingly, at its broadest the invention pertains to the detection and/or analysis of nucleic acid in general.

Array technology, particularly microarrays, arose from research at Stanford University where small amounts of DNA oligonucleotides were successfully attached to a glass surface in an ordered arrangement, a so-called “array”, and used to monitor the transcription of 45 genes (Schena M et al, Science. 1995; 270: 368-9, 371).

Since then, researchers around the world have published more than 30,000 papers using microarray technology. Multiple types of microarray have been developed for various applications, e.g. to detect single nucleotide polymorphisms (SNPs) or to genotype or re-sequence mutant genomes, and an important use of microarray technology has been for the investigation of gene expression. Indeed, the gene expression microarray was created as a means to analyze the level of expressed genetic material in a particular sample, with the real gain being the possibility to compare expression levels of many genes simultaneously. Several commercial microarray platforms are available for these types of experiments but it has also been possible to create custom made gene expression arrays.

Whilst the use of microarrays in gene expression studies is now commonplace, it is evident that new and more comprehensive so-called “next-generation DNA sequencing” (NGS) technologies are starting to replace DNA microarrays for many applications, e.g. in-depth transcriptome analysis.

The development of NGS technologies for ultra-fast genome sequencing represents a milestone in the life sciences (Petterson E et al, Genomics. 2009; 93: 105-11). These new technologies have dramatically decreased the cost of DNA sequencing and enabled the determination of the genome of higher organisms at an unprecedented rate, including those of specific individuals (Wade C M et al Science. 2009; 326: 865-7; Rubin J et al, Nature 2010; 464: 587-91). The new advances in high-throughput genomics have reshaped the biological research landscape and in addition to complete characterization of genomes it is possible also to study the full transcriptome in a digital and quantitative fashion. The bioinformatics tools to visualize and integrate these comprehensive sets of data have also been significantly improved during recent years.

However, it has surprisingly been found that a unique combination of histological, microarray and NGS techniques can yield comprehensive transcriptional or genomic information from multiple cells in a tissue sample which information is characterised by a two-dimensional spatial resolution. Thus, at one extreme the methods of the present invention can be used to analyse the expression of a single gene in a single cell in a sample, whilst retaining the cell within its context in the tissue sample. At the other extreme, and in a preferred aspect of the invention, the methods can be used to determine the expression of every gene in each and every cell, or substantially all cells, in a sample simultaneously, i.e. the global spatial expression pattern of a tissue sample. It will be apparent that the methods of the invention also enable intermediate analyses to be performed.

In its simplest form, the invention may be illustrated by the following summary. The invention requires reverse transcription (RT) primers, which comprise also unique positional tags (domains), to be arrayed on an object substrate, e.g. a glass slide, to generate an “array”. The unique positional tags correspond to the location of the RT primers on the array (the features of the array). Thin tissue sections are placed onto the array and a reverse transcription reaction is performed in the tissue section on the object slide. The RT primers, to which the RNA in the tissue sample binds (or hybridizes), are extended using the bound RNA as a template to obtain cDNA, which is therefore bound to the surface of the array. As a consequence of the unique positional tags in the RT primers, each cDNA strand carries information about the position of the template RNA in the tissue section. The tissue section may be visualised or imaged, e.g. stained and photographed, before or after the cDNA synthesis step to enable the positional tag in the cDNA molecule to be correlated with a position within the tissue sample. The cDNA is sequenced, which results in a transcriptome with exact positional information. A schematic of the process is shown in FIG. 1. The sequence data can then be matched to a position in the tissue sample, which enables the visualization, e.g. using a computer, of the sequence data together with the tissue section, for instance to display the expression pattern of any gene of interest across the tissue (FIG. 2). Similarly, it would be possible to mark different areas of the tissue section on the computer screen and obtain information on differentially expressed genes between any selected areas of interest. It will be evident that the methods of the invention result in data that is in stark contrast to the data obtained using current methods to study mRNA populations. For example, methods based on in situ hybridization provide only relative information of single mRNA transcripts. Thus, the methods of the present invention have clear advantages over current in situ technologies. The global gene expression information obtainable from the methods of the invention also allows co-expression information and quantitative estimates of transcript abundance. It will be evident that this is a generally applicable strategy available for the analysis of any tissue in any species, e.g. animal, plant, fungus.
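The matching of sequence data to a position in the tissue sample, as summarised above, can be sketched in a few lines of Python. This is an illustrative sketch only: the tag sequences, the tag length of 6 nucleotides, the feature coordinates and the function name `locate_read` are hypothetical, and the sketch assumes that each read begins with the positional tag followed by the sequence derived from the captured RNA.

```python
from typing import Dict, Optional, Tuple

# Hypothetical positional tags and the (column, row) coordinates of the
# array features that carry them. Real tags and layouts will differ.
TAG_LENGTH = 6
TAG_TO_FEATURE: Dict[str, Tuple[int, int]] = {
    "ACGTAC": (0, 0),
    "TTGACA": (0, 1),
    "GGCATT": (1, 0),
}


def locate_read(read: str) -> Optional[Tuple[Tuple[int, int], str]]:
    """Split a read into (feature position, transcript-derived sequence).

    Assumes the read starts with the positional tag. Returns None when
    the tag is not found in the lookup table.
    """
    tag, insert = read[:TAG_LENGTH], read[TAG_LENGTH:]
    position = TAG_TO_FEATURE.get(tag)
    if position is None:
        return None
    return position, insert


reads = ["ACGTACGATTACA", "GGCATTCCCGGGA", "NNNNNNAAAA"]
located = [locate_read(r) for r in reads]
```

An exact dictionary lookup is used here for simplicity; reads whose tag is unknown are simply left unassigned.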

As noted above, and described in more detail below, it will be evident that this basic methodology could readily be extended to the analysis of genomic DNA, e.g. to identify cells within a tissue sample that comprise one or more specific mutations. For instance, the genomic DNA may be fragmented and allowed to hybridise to primers (equivalent to the RT primers described above), which are capable of capturing the fragmented DNA (e.g. an adapter with a sequence that is complementary to the primer may be ligated to the fragmented DNA or the fragmented DNA may be extended e.g. using an enzyme to incorporate additional nucleotides at the end of the sequence, e.g. a poly-A tail, to generate a sequence that is complementary to the primer) and priming the synthesis of complementary strands to the capture molecules. The remaining steps of the analysis may be as described above. Hence, the specific embodiments of the invention described below in the context of transcriptome analysis may also be employed in methods of analysing genomic DNA, where appropriate.

It will be seen from the above explanation that there is an immense value in coupling positional information to transcriptome or genome information. For instance, it enables global gene expression mapping at high resolution, which will find utility in numerous applications, including e.g. cancer research and diagnostics.

Furthermore, it is evident that the methods described herein differ significantly from the previously described methods for analysis of the global transcriptome of a tissue sample and these differences result in numerous advantages. The present invention is predicated on the surprising discovery that the use of tissue sections does not interfere with synthesis of DNA (e.g. cDNA) primed by primers (e.g. reverse transcription primers) that are coupled to the surface of an array.

Thus, in its first and broadest aspect, the present invention provides a method for localised detection of nucleic acid in a tissue sample comprising:

(a) providing an array comprising a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a primer for a primer extension or ligation reaction, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

(b) contacting said array with a tissue sample such that the position of a capture probe on the array may be correlated with a position in the tissue sample and allowing nucleic acid of the tissue sample to hybridise to the capture domain in said capture probes;

(c) generating DNA molecules from the captured nucleic acid molecules using said capture probes as extension or ligation primers, wherein said extended or ligated DNA molecules are tagged by virtue of the positional domain;

(d) optionally generating a complementary strand of said tagged DNA and/or optionally amplifying said tagged DNA;

(e) releasing at least part of the tagged DNA molecules and/or their complements or amplicons from the surface of the array, wherein said part includes the positional domain or a complement thereof;

(f) directly or indirectly analysing the sequence of the released DNA molecules.

The methods of the invention represent a significant advance over other methods for spatial transcriptomics known in the art. For example the methods described herein result in a global and spatial profile of all transcripts in the tissue sample. Moreover, the expression of every gene can be quantified for each position or feature on the array, which enables a multiplicity of analyses to be performed based on data from a single assay. Thus, the methods of the present invention make it possible to detect and/or quantify the spatial expression of all genes in a single tissue sample. Moreover, as the abundance of the transcripts is not visualised directly, e.g. by fluorescence, akin to a standard microarray, it is possible to measure the expression of genes in a single sample simultaneously even wherein said transcripts are present at vastly different concentrations in the same sample.
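Quantifying the expression of each gene at each feature amounts to counting sequenced molecules per (feature, gene) pair. The following Python sketch shows one way this could be tabulated. The gene names, the feature coordinates and the function name `count_by_feature` are hypothetical, and the sketch assumes that each read has already been assigned to a feature (via its positional domain) and to a gene.

```python
from collections import Counter, defaultdict

# Hypothetical input: one (feature position, gene) pair per sequenced
# cDNA molecule, i.e. reads already assigned to a feature and a gene.
assigned_reads = [
    ((0, 0), "GENE_A"),
    ((0, 0), "GENE_A"),
    ((0, 0), "GENE_B"),
    ((1, 0), "GENE_B"),
]


def count_by_feature(pairs):
    """Return a mapping: feature position -> Counter of reads per gene."""
    counts = defaultdict(Counter)
    for position, gene in pairs:
        counts[position][gene] += 1
    return counts


counts = count_by_feature(assigned_reads)
```

Because the counts are derived from sequence data rather than from signal intensity, abundant and rare transcripts are tabulated in the same way.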

Accordingly, in a second and more particular aspect, the present invention can be seen to provide a method for determining and/or analysing a transcriptome of a tissue sample comprising:

(a) providing an array comprising a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a reverse transcriptase (RT) primer, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

(b) contacting said array with a tissue sample such that the position of a capture probe on the array may be correlated with a position in the tissue sample and allowing RNA of the tissue sample to hybridise to the capture domain in said capture probes;

(c) generating cDNA molecules from the captured RNA molecules using said capture probes as RT primers, and optionally amplifying said cDNA molecules;

(d) releasing at least part of the cDNA molecules and/or optionally their amplicons from the surface of the array, wherein said released molecule may be a first strand and/or second strand cDNA molecule or an amplicon thereof and wherein said part includes the positional domain or a complement thereof;

(e) directly or indirectly analysing the sequence of the released molecules.

As described in more detail below, any method of nucleic acid analysis may be used in the analysis step. Typically this may involve sequencing, but it is not necessary to perform an actual sequence determination. For example sequence-specific methods of analysis may be used. For example a sequence-specific amplification reaction may be performed, for example using primers which are specific for the positional domain and/or for a specific target sequence, e.g. a particular target DNA to be detected (i.e. corresponding to a particular cDNA/RNA or gene etc.). An exemplary analysis method is a sequence-specific PCR reaction.

The sequence analysis information obtained in step (e) may be used to obtain spatial information as to the RNA in the sample. In other words the sequence analysis information may provide information as to the location of the RNA in the sample. This spatial information may be derived from the nature of the sequence analysis information determined, for example it may reveal the presence of a particular RNA which may itself be spatially informative in the context of the tissue sample used, and/or the spatial information (e.g. spatial localisation) may be derived from the position of the tissue sample on the array, coupled with the sequencing information. Thus, the method may involve simply correlating the sequence analysis information to a position in the tissue sample e.g. by virtue of the positional tag and its correlation to a position in the tissue sample. However, as described above, spatial information may conveniently be obtained by correlating the sequence analysis data to an image of the tissue sample and this represents one preferred embodiment of the invention. Accordingly, in a preferred embodiment the method also includes a step of:

(f) correlating said sequence analysis information with an image of said tissue sample, wherein the tissue sample is imaged before or after step (c).
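One simple way to picture the correlation in step (f) is as a mapping from feature coordinates on the array to pixel coordinates in the image of the tissue sample. The Python sketch below assumes a regular grid of features aligned with the image axes. The function name `feature_to_pixel`, the origin and the pitch values are hypothetical, and a real alignment may require rotation or scaling that this sketch does not model.

```python
from typing import Tuple


def feature_to_pixel(
    col: int,
    row: int,
    origin_px: Tuple[float, float] = (100.0, 50.0),
    pitch_px: float = 20.0,
) -> Tuple[float, float]:
    """Map a feature's (col, row) grid position to image pixel (x, y).

    origin_px is the assumed pixel position of feature (0, 0) and
    pitch_px the assumed centre-to-centre feature spacing in pixels.
    """
    x0, y0 = origin_px
    return (x0 + col * pitch_px, y0 + row * pitch_px)


first = feature_to_pixel(0, 0)
other = feature_to_pixel(2, 3)
```

Sequence analysis information assigned to a feature can then be overlaid on the image at the computed pixel position.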

In its broadest sense, the method of the invention may be used for localised detection of a nucleic acid in a tissue sample. Thus, in one embodiment, the method of the invention may be used for determining and/or analysing all of the transcriptome or genome of a tissue sample e.g. the global transcriptome of a tissue sample. However, the method is not limited to this and encompasses determining and/or analysing all or part of the transcriptome or genome. Thus, the method may involve determining and/or analysing a part or subset of the transcriptome or genome, e.g. a transcriptome corresponding to a subset of genes, e.g. a set of particular genes, for example related to a particular disease or condition, tissue type etc.

Viewed from another aspect, the method steps set out above can be seen as providing a method of obtaining a spatially defined transcriptome or genome, and in particular the spatially defined global transcriptome or genome, of a tissue sample.

Alternatively viewed, the method of the invention may be seen as a method for localised or spatial detection of nucleic acid, whether DNA or RNA in a tissue sample, or for localised or spatial determination and/or analysis of nucleic acid (DNA or RNA) in a tissue sample. In particular, the method may be used for the localised or spatial detection or determination and/or analysis of gene expression or genomic variation in a tissue sample. The localised/spatial detection/determination/analysis means that the RNA or DNA may be localised to its native position or location within a cell or tissue in the tissue sample. Thus for example, the RNA or DNA may be localised to a cell or group of cells, or type of cells in the sample, or to particular regions or areas within a tissue sample. The native location or position of the RNA or DNA (or in other words, the location or position of the RNA or DNA in the tissue sample), e.g. an expressed gene or genomic locus, may be determined.

The invention can also be seen to provide an array for use in the methods of the invention comprising a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a reverse transcriptase (RT) primer, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain to capture RNA of a tissue sample that is contacted with said array.
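The 5′ to 3′ layout of a capture probe described above can be modelled as a small data structure. The Python sketch below is illustrative only: the class name `CaptureProbe`, the example sequences and the use of a poly-T capture domain (which would pair with a poly-A stretch at the 3′ end of an RNA) are assumptions made for the example, and the hybridisation check is a toy exact-match test rather than a model of real hybridisation.

```python
from dataclasses import dataclass

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass(frozen=True)
class CaptureProbe:
    positional_domain: str  # unique to one feature (hypothetical tag)
    capture_domain: str     # hybridises to the target; free 3' end

    @property
    def sequence(self) -> str:
        """Full probe, 5' positional domain followed by 3' capture domain."""
        return self.positional_domain + self.capture_domain

    def can_capture(self, rna: str) -> bool:
        """Toy check: does the 3' end of the RNA pair with the capture domain?"""
        target = rna.upper().replace("U", "T")
        return target.endswith(reverse_complement(self.capture_domain))


probe = CaptureProbe(positional_domain="ACGTAC", capture_domain="TTTTTTTT")
```

Placing the capture domain at the 3′ end is what allows the probe to act as a primer once a target has hybridised to it.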

In a related aspect, the present invention also provides use of an array, comprising a substrate on which multiple species of capture probe are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a reverse transcriptase (RT) primer, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array; and

(ii) a capture domain;

to capture RNA of a tissue sample that is contacted with said array.

Preferably, said use is for determining and/or analysing a transcriptome and in particular the global transcriptome, of a tissue sample and further comprises steps of:

(a) generating cDNA molecules from the captured RNA molecules using said capture probes as RT primers and optionally amplifying said cDNA molecules;

(b) releasing at least part of the cDNA molecules and/or optionally their amplicons from the surface of the array, wherein said released molecule may be a first strand and/or second strand cDNA molecule or an amplicon thereof and wherein said part includes the positional domain or a complement thereof;

(c) directly or indirectly analysing the sequence of the released molecules; and optionally

(d) correlating said sequence analysis information with an image of said tissue sample, wherein the tissue sample is imaged before or after step (a).

It will be seen therefore that the array of the present invention may be used to capture RNA, e.g. mRNA of a tissue sample that is contacted with said array.

The array may also be used for determining and/or analysing a partial or global transcriptome of a tissue sample or for obtaining a spatially defined partial or global transcriptome of a tissue sample. The methods of the invention may thus be considered as methods of quantifying the spatial expression of one or more genes in a tissue sample. Expressed another way, the methods of the present invention may be used to detect the spatial expression of one or more genes in a tissue sample. In yet another way, the methods of the present invention may be used to determine simultaneously the expression of one or more genes at one or more positions within a tissue sample. Still further, the methods may be seen as methods for partial or global transcriptome analysis of a tissue sample with two-dimensional spatial resolution.

The RNA may be any RNA molecule which may occur in a cell. Thus it may be mRNA, tRNA, rRNA, viral RNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), microRNA (miRNA), small interfering RNA (siRNA), piwi-interacting RNA (piRNA), ribozymal RNA, antisense RNA or non-coding RNA. Preferably however it is mRNA.

Step (c) in the method above (corresponding to step (a) in the preferred statement of use set out above) of generating cDNA from the captured RNA will be seen as relating to the synthesis of the cDNA. This will involve a step of reverse transcription of the captured RNA, extending the capture probe, which functions as the RT primer, using the captured RNA as template. Such a step generates so-called first strand cDNA. As will be described in more detail below, second strand cDNA synthesis may optionally take place on the array, or it may take place in a separate step, after release of first strand cDNA from the array. As also described in more detail below, in certain embodiments second strand synthesis may occur in the first step of amplification of a released first strand cDNA molecule.
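The extension of the capture probe into first strand cDNA can be illustrated at the sequence level. In the Python sketch below the function name `first_strand_cdna` and the example sequences are hypothetical, and the sketch assumes that the capture domain pairs exactly with the last `capture_len` nucleotides of the RNA and that extension proceeds to the 5′ end of the template.

```python
# RNA base -> complementary DNA base
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def first_strand_cdna(probe: str, capture_len: int, rna: str) -> str:
    """Return the extended probe (first strand cDNA), written 5' to 3'.

    probe: capture probe ending in a capture domain of capture_len nt.
    rna:   captured RNA, written 5' to 3'; its last capture_len nt are
           assumed to be hybridised to the capture domain.
    """
    template = rna[:-capture_len]                     # RNA 5' of the duplex
    extension = template.translate(RNA_TO_CDNA)[::-1]  # DNA copy, 5' to 3'
    return probe + extension


cdna = first_strand_cdna("ACGTACTTTT", capture_len=4, rna="GAUUACAAAA")
```

The resulting molecule begins with the positional domain of the probe, which is why each cDNA strand carries information about where its template RNA was captured.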

Arrays for use in the context of nucleic acid analysis in general, and DNA analysis in particular, are discussed and described below. Specific details and embodiments described herein in relation to arrays and capture probes for use in the context of RNA, apply equally (where appropriate) to all such arrays, including those for use with DNA.

As used herein the term “multiple” means two or more, or at least two, e.g. 3, 5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 400, 500, 1000, 2000, 5000, 10,000, or more etc. Thus for example, the number of capture probes may be any integer in any range between any two of the aforementioned numbers. It will be appreciated however that it is envisaged that conventional-type arrays with many hundreds, thousands, tens of thousands, hundreds of thousands or even millions of capture probes may be used.

Thus, the methods outlined herein utilise high density nucleic acid arrays comprising “capture probes” for capturing and labelling transcripts from all of the single cells within a tissue sample e.g. a thin tissue sample slice, or “section”. The tissue samples or sections for analysis are produced in a highly parallelized fashion, such that the spatial information in the section is retained. The captured RNA (preferably mRNA) molecules for each cell, or “transcriptomes”, are transcribed into cDNA and the resultant cDNA molecules are analyzed, for example by high throughput sequencing. The resultant data may be correlated to images of the original tissue samples e.g. sections through so-called barcode sequences (or ID tags, defined herein as positional domains) incorporated into the arrayed nucleic acid probes.

High density nucleic acid arrays or microarrays are a core component ofthe spatial transcriptome labelling method described herein. Amicroarray is a multiplex technology used in molecular biology. Atypical microarray consists of an arrayed series of microscopic spots ofoligonucleotides (hundreds of thousands of spots, generally tens ofthousands, can be incorporated on a single array). The distinct positionof each nucleic acid (oligonucleotide) spot (each species ofoligonucleotide/nucleic acid molecule) is known as a “feature” (andhence in the methods set out above each species of capture probe may beviewed as a specific feature of the array; each feature occupies adistinct position on the array), and typically each separate featurecontains in the region of picomoles (10⁻¹² moles) of a specific DNAsequence (a “species”), which are known as “probes” (or “reporters”).Typically, these can be a short section of a gene or other nucleic acidelement to which a cDNA or cRNA sample (or “target”) can hybridize underhigh-stringency hybridization conditions. However, as described below,the probes of the present invention differ from the probes of standardmicroarrays.

In gene expression microarrays, probe-target hybridization is usually detected and quantified by detection of a visual signal, e.g. a fluorophore, silver ion, or chemiluminescence-label, which has been incorporated into all of the targets. The intensity of the visual signal correlates to the relative abundance of each target nucleic acid in the sample. Since an array can contain tens of thousands of probes, a microarray experiment can accomplish many genetic tests in parallel.

In standard microarrays, the probes are attached to a solid surface or substrate by a covalent bond to a chemical matrix, e.g. epoxy-silane, amino-silane, lysine, polyacrylamide etc. The substrate typically is a glass, plastic or silicon chip or slide, although other microarray platforms are known, e.g. microscopic beads.

The probes may be attached to the array of the invention by any suitable means. In a preferred embodiment the probes are immobilized to the substrate of the array by chemical immobilization. This may be an interaction between the substrate (support material) and the probe based on a chemical reaction. Such a chemical reaction typically does not rely on the input of energy via heat or light, but can be enhanced by either applying heat, e.g. a certain optimal temperature for a chemical reaction, or light of a certain wavelength. For example, a chemical immobilization may take place between functional groups on the substrate and corresponding functional elements on the probes. Such corresponding functional elements in the probes may either be an inherent chemical group of the probe, e.g. a hydroxyl group, or be additionally introduced. An example of such a functional group is an amine group. Typically, the probe to be immobilized comprises a functional amine group or is chemically modified in order to comprise a functional amine group. Means and methods for such a chemical modification are well known.

The localization of said functional group within the probe to be immobilized may be used in order to control and shape the binding behaviour and/or orientation of the probe, e.g. the functional group may be placed at the 5′ or 3′ end of the probe or within the sequence of the probe. A typical substrate for a probe to be immobilized comprises moieties which are capable of binding to such probes, e.g. to amine-functionalized nucleic acids. Examples of such substrates are carboxy, aldehyde or epoxy substrates. Such materials are known to the person skilled in the art. Functional groups, which impart a connecting reaction between probes which are chemically reactive by the introduction of an amine group, and array substrates are known to the person skilled in the art.

Alternative substrates on which probes may be immobilized may have to be chemically activated, e.g. by the activation of functional groups available on the array substrate. The term “activated substrate” relates to a material in which interacting or reactive chemical functional groups were established or enabled by chemical modification procedures as known to the person skilled in the art. For example, a substrate comprising carboxyl groups has to be activated before use. Furthermore, there are substrates available that contain functional groups that can react with specific moieties already present in the nucleic acid probes.

Alternatively, the probes may be synthesized directly on the substrate. Suitable methods for such an approach are known to the person skilled in the art. Examples are manufacture techniques developed by Agilent Inc., Affymetrix Inc., Roche Nimblegen Inc. or Flexgen BV. Typically, lasers and a set of mirrors that specifically activate the spots where nucleotide additions are to take place are used. Such an approach may provide, for example, spot sizes (i.e. features) of around 30 μm or larger.

The substrate therefore may be any suitable substrate known to the person skilled in the art. The substrate may have any suitable form or format, e.g. it may be flat or curved, e.g. convexly or concavely curved towards the area where the interaction between the tissue sample and the substrate takes place. Particularly preferred is the case where the substrate is a flat, i.e. planar, chip or slide.

Typically, the substrate is a solid support and thereby allows for an accurate and traceable positioning of the probes on the substrate. An example of a substrate is a solid material or a substrate comprising functional chemical groups, e.g. amine groups or amine-functionalized groups. A substrate envisaged by the present invention is a non-porous substrate. Preferred non-porous substrates are glass, silicon, poly-L-lysine coated material, nitrocellulose, polystyrene, cyclic olefin copolymers (COCs), cyclic olefin polymers (COPs), polypropylene, polyethylene and polycarbonate.

Any suitable material known to the person skilled in the art may be used. Typically, glass or polystyrene is used. Polystyrene is a hydrophobic material suitable for binding negatively charged macromolecules because it normally contains few hydrophilic groups. For nucleic acids immobilized on glass slides, it is furthermore known that by increasing the hydrophobicity of the glass surface the nucleic acid immobilization may be increased. Such an enhancement may permit a relatively more densely packed formation. In addition to a coating or surface treatment with poly-L-lysine, the substrate, in particular glass, may be treated by silanation, e.g. with epoxy-silane or amino-silane or by silynation or by a treatment with polyacrylamide.

A number of standard arrays are commercially available and both the number and size of the features may be varied. In the present invention, the arrangement of the features may be altered to correspond to the size and/or density of the cells present in different tissues or organisms. For instance, animal cells typically have a cross-section in the region of 1-100 μm, whereas the cross-section of plant cells typically may range from 1-10000 μm. Hence, Nimblegen® arrays, which are available with up to 2.1 million features, or 4.2 million features, and feature sizes of 13 micrometers, may be preferred for tissue samples from an animal or fungus, whereas other formats, e.g. with 8×130 k features, may be sufficient for plant tissue samples. Commercial arrays are also available or known for use in the context of sequence analysis and in particular in the context of NGS technologies. Such arrays may also be used as the array surface in the context of the present invention e.g. an Illumina bead array. In addition to commercially available arrays, which can themselves be customized, it is possible to make custom or non-standard “in-house” arrays and methods for generating arrays are well-established. The methods of the invention may utilise both standard and non-standard arrays that comprise probes as defined below.

The probes on a microarray may be immobilized, i.e. attached or bound, to the array preferably via the 5′ or 3′ end, depending on the chemical matrix of the array. Typically, for commercially available arrays, the probes are attached via a 3′ linkage, thereby leaving a free 5′ end. However, arrays comprising probes attached to the substrate via a 5′ linkage, thereby leaving a free 3′ end, are available and may be synthesized using standard techniques that are well known in the art and are described elsewhere herein.

The covalent linkage used to couple a nucleic acid probe to an array substrate may be viewed as both a direct and indirect linkage, in that although the probe is attached by a “direct” covalent bond, there may be a chemical moiety or linker separating the “first” nucleotide of the nucleic acid probe from the, e.g. glass or silicon, substrate, i.e. an indirect linkage. For the purposes of the present invention probes that are immobilized to the substrate by a covalent bond and/or chemical linker are generally seen to be immobilized or attached directly to the substrate.

As will be described in more detail below, the capture probes of the invention may be immobilized on, or interact with, the array directly or indirectly. Thus the capture probes need not bind directly to the array, but may interact indirectly, for example by binding to a molecule which itself binds directly or indirectly to the array (e.g. the capture probe may interact with (e.g. bind or hybridize to) a binding partner for the capture probe, i.e. a surface probe, which is itself bound to the array directly or indirectly). Generally speaking, however, the capture probe will be, directly or indirectly (by one or more intermediaries), bound to, or immobilized on, the array.

The use, method and array of the invention may comprise probes that are immobilized via their 5′ or 3′ end. However, when the capture probe is immobilized directly to the array substrate, it may be immobilized only such that the 3′ end of the capture probe is free to be extended, e.g. it is immobilized by its 5′ end. The capture probe may be immobilized indirectly, such that it has a free, i.e. extendible, 3′ end.

By extended or extendible 3′ end, it is meant that further nucleotides may be added to the most 3′ nucleotide of the nucleic acid molecule, e.g. capture probe, to extend the length of the nucleic acid molecule, i.e. the standard polymerization reaction utilized to extend nucleic acid molecules, e.g. templated polymerization catalyzed by a polymerase.

Thus, in one embodiment, the array comprises probes that are immobilized directly via their 3′ end, so-called surface probes, which are defined below. Each species of surface probe comprises a region of complementarity to each species of capture probe, such that the capture probe may hybridize to the surface probe, resulting in the capture probe comprising a free extendible 3′ end. In a preferred aspect of the invention, when the array comprises surface probes, the capture probes are synthesized in situ on the array.

The array probes may be made up of ribonucleotides and/or deoxyribonucleotides as well as synthetic nucleotide residues that are capable of participating in Watson-Crick type or analogous base pair interactions. Thus, the nucleic acid domain may be DNA or RNA or any modification thereof e.g. PNA or other derivatives containing non-nucleotide backbones. However, in the context of transcriptome analysis the capture domain of the capture probe must be capable of priming a reverse transcription reaction to generate cDNA that is complementary to the captured RNA molecules. As described below in more detail, in the context of genome analysis, the capture domain of the capture probe must be capable of binding to the DNA fragments, which may comprise binding to a binding domain that has been added to the fragmented DNA. In some embodiments, the capture domain of the capture probe may prime a DNA extension (polymerase) reaction to generate DNA that is complementary to the captured DNA molecules. In other embodiments, the capture domain may template a ligation reaction between the captured DNA molecules and a surface probe that is directly or indirectly immobilised on the substrate. In yet other embodiments, the capture domain may be ligated to one strand of the captured DNA molecules.

In a preferred embodiment of the invention at least the capture domain of the capture probe comprises or consists of deoxyribonucleotides (dNTPs). In a particularly preferred embodiment the whole of the capture probe comprises or consists of deoxyribonucleotides.

In a preferred embodiment of the invention the capture probes are immobilized on the substrate of the array directly, i.e. by their 5′ end, resulting in a free extendible 3′ end.

The capture probes of the invention comprise at least two domains, a capture domain and a positional domain (or a feature identification tag or domain; the positional domain may alternatively be defined as an identification (ID) domain or tag, or as a positional tag). The capture probe may further comprise a universal domain as defined further below. Where the capture probe is indirectly attached to the array surface via hybridization to a surface probe, the capture probe requires a sequence (e.g. a portion or domain) which is complementary to the surface probe.

Such a complementary sequence may be complementary to a positional/identification domain and/or a universal domain on the surface probe. In other words the positional domain and/or universal domain may constitute the region or portion of the probe which is complementary to the surface probe. However, the capture probe may also comprise an additional domain (or region, portion or sequence) which is complementary to the surface probe. For ease of synthesis, as described in more detail below, such a surface probe-complementary region may be provided as part, or as an extension of the capture domain (such a part or extension not itself being used for, or capable of, binding to the target nucleic acid, e.g. RNA).

The capture domain is typically located at the 3′ end of the capture probe and comprises a free 3′ end that can be extended, e.g. by template dependent polymerization. The capture domain comprises a nucleotide sequence that is capable of hybridizing to nucleic acid, e.g. RNA (preferably mRNA), present in the cells of the tissue sample contacted with the array.

Advantageously, the capture domain may be selected or designed to bind (or put more generally may be capable of binding) selectively or specifically to the particular nucleic acid, e.g. RNA it is desired to detect or analyse. For example the capture domain may be selected or designed for the selective capture of mRNA. As is well known in the art, this may be on the basis of hybridisation to the poly-A tail of mRNA. Thus, in a preferred embodiment the capture domain comprises a poly-T DNA oligonucleotide, i.e. a series of consecutive deoxythymidine residues linked by phosphodiester bonds, which is capable of hybridizing to the poly-A tail of mRNA. Alternatively, the capture domain may comprise nucleotides which are functionally or structurally analogous to poly-T i.e., are capable of binding selectively to poly-A, for example a poly-U oligonucleotide or an oligonucleotide comprised of deoxythymidine analogues, wherein said oligonucleotide retains the functional property of binding to poly-A. In a particularly preferred embodiment the capture domain, or more particularly the poly-T element of the capture domain, comprises at least 10 nucleotides, preferably at least 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides. In a further embodiment, the capture domain, or more particularly the poly-T element of the capture domain comprises at least 25, 30 or 35 nucleotides.
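
Purely as an illustrative sketch, the selective capture of mRNA by a poly-T element may be modelled by measuring the poly-A tail at the 3′ end of a transcript and comparing it with the length of the poly-T element. The function names and the all-or-nothing criterion (the tail must be at least as long as the poly-T element) are simplifying assumptions made here; real hybridisation also depends on temperature, salt and partial pairing.

```python
def poly_a_tail_length(mrna):
    """Count consecutive A residues at the 3' end of an RNA written 5'->3'."""
    tail = 0
    for base in reversed(mrna.upper()):
        if base != "A":
            break
        tail += 1
    return tail


def can_capture(mrna, poly_t_length=20):
    """Assumed criterion: the poly-A tail fully covers the poly-T element."""
    return poly_a_tail_length(mrna) >= poly_t_length


polyadenylated = "AUGGCC" + "A" * 25
short_tail = "AUGGCC" + "A" * 5
```

With the default 20-nucleotide poly-T element, `can_capture(polyadenylated)` is true and `can_capture(short_tail)` is false under this simplified criterion.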

Random sequences may also be used in the capture of nucleic acid, as is known in the art, e.g. random hexamers or similar sequences, and hence such random sequences may be used to form all or a part of the capture domain. For example, random sequences may be used in conjunction with poly-T (or poly-T analogue etc.) sequences. Thus where a capture domain comprises a poly-T (or a “poly-T-like”) oligonucleotide, it may also comprise a random oligonucleotide sequence. This may for example be located 5′ or 3′ of the poly-T sequence, e.g. at the 3′ end of the capture probe, but the positioning of such a random sequence is not critical. Such a construct may facilitate the capturing of the initial part of the poly-A of mRNA. Alternatively, the capture domain may be an entirely random sequence. Degenerate capture domains may also be used, according to principles known in the art.

The capture domain may be capable of binding selectively to a desired sub-type or subset of nucleic acid, e.g. RNA, for example a particular type of RNA such as mRNA or rRNA etc. as listed above, or to a particular subset of a given type of RNA, for example, a particular mRNA species e.g. corresponding to a particular gene or group of genes. Such a capture probe may be selected or designed based on the sequence of the RNA it is desired to capture. Thus it may be a sequence-specific capture probe, specific for a particular RNA target or group of targets (target group etc). Thus, it may be based on a particular gene sequence or particular motif sequence or common/conserved sequence etc., according to principles well known in the art.

In embodiments where the capture probe is immobilized on the substrate of the array indirectly, e.g. via hybridization to a surface probe, the capture domain may further comprise an upstream sequence (5′ to the sequence that hybridizes to the nucleic acid, e.g. RNA of the tissue sample) that is capable of hybridizing to the 5′ end of the surface probe. Alone, the capture domain of the capture probe may be seen as a capture domain oligonucleotide, which may be used in the synthesis of the capture probe in embodiments where the capture probe is immobilized on the array indirectly.

The positional domain (feature identification domain or tag) of the capture probe is located directly or indirectly upstream, i.e. closer to the 5′ end of the capture probe nucleic acid molecule, of the capture domain. Preferably the positional domain is directly adjacent to the capture domain, i.e. there is no intermediate sequence between the capture domain and the positional domain. In some embodiments the positional domain forms the 5′ end of the capture probe, which may be immobilized directly or indirectly on the substrate of the array.

As discussed above, each feature (distinct position) of the array comprises a spot of a species of nucleic acid probe, wherein the positional domain at each feature is unique. Thus, a “species” of capture probe is defined with reference to its positional domain; a single species of capture probe will have the same positional domain. However, it is not required that each member of a species of capture probe has the same sequence in its entirety. In particular, since the capture domain may be or may comprise a random or degenerate sequence, the capture domains of individual probes within a species may vary. Accordingly, in some embodiments where the capture domains of the capture probes are the same, each feature comprises a single probe sequence. However in other embodiments where the capture probe varies, members of a species of probe will not have the exact same sequence, although the sequence of the positional domain of each member in the species will be the same. What is required is that each feature or position of the array carries a capture probe of a single species (specifically each feature or position carries a capture probe which has an identical positional tag, i.e. there is a single positional domain at each feature or position). Each species has a different positional domain which identifies the species. However, each member of a species may in some cases, as described in more detail herein, have a different capture domain, as the capture domain may be random or degenerate or may have a random or degenerate component. This means that within a given feature, or position, the capture domain of the probes may differ.

Thus in some, but not necessarily in all embodiments, the nucleotide sequence of any one probe molecule immobilized at a particular feature is the same as the other probe molecules immobilized at the same feature, but the nucleotide sequence of the probes at each feature is different, distinct or distinguishable from the probes immobilized at every other feature. Preferably each feature comprises a different species of probe. However, in some embodiments it may be advantageous for a group of features to comprise the same species of probe, i.e. effectively to produce a feature covering an area of the array that is greater than a single feature, e.g. to lower the resolution of the array. In other embodiments of the array, the nucleotide sequence of the positional domain of any one probe molecule immobilized at a particular feature may be the same as the other probe molecules immobilized at the same feature but the capture domain may vary. The capture domain may nonetheless be designed to capture the same type of molecule, e.g. mRNA in general.

The positional domain (or tag) of the capture probe comprises the sequence which is unique to each feature and acts as a positional or spatial marker (the identification tag). In this way each region or domain of the tissue sample, e.g. each cell in the tissue, will be identifiable by spatial resolution across the array linking the nucleic acid, e.g. RNA (e.g. the transcripts) from a certain cell to a unique positional domain sequence in the capture probe. By virtue of the positional domain a capture probe in the array may be correlated to a position in the tissue sample, for example it may be correlated to a cell in the sample. Thus, the positional domain of the capture probe may be seen as a nucleic acid tag (identification tag).

Any suitable sequence may be used as the positional domain in the capture probes of the invention. By a suitable sequence, it is meant that the positional domain should not interfere with (i.e. inhibit or distort) the interaction between the RNA of the tissue sample and the capture domain of the capture probe. For example, the positional domain should be designed such that nucleic acid molecules in the tissue sample do not hybridize specifically to the positional domain. Preferably, the nucleic acid sequence of the positional domain of the capture probes has less than 80% sequence identity to the nucleic acid sequences in the tissue sample. Preferably, the positional domain of the capture probe has less than 70%, 60%, 50% or less than 40% sequence identity across a substantial part of the nucleic acid molecules in the tissue sample. Sequence identity may be determined by any appropriate method known in the art, e.g. using the BLAST alignment algorithm.

In a preferred embodiment the positional domain of each species of capture probe contains a unique barcode sequence. The barcode sequences may be generated using random sequence generation. The randomly generated sequences may be followed by stringent filtering by mapping to the genomes of all common reference species and with pre-set Tm intervals, GC content and a defined distance of difference to the other barcode sequences to ensure that the barcode sequences will not interfere with the capture of the nucleic acid, e.g. RNA from the tissue sample and will be distinguishable from each other without difficulty.
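
A minimal sketch of such random generation followed by filtering is given below for illustration. Everything specific in it is an assumption made for the sketch rather than a value from this description: the barcode length of 18, the GC window, the Tm window, the use of the rough Wallace rule (2 °C per A/T, 4 °C per G/C) as a Tm estimate, and Hamming distance as the “distance of difference”. The genome-mapping filter is omitted because it requires external reference data.

```python
import random


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Crude Tm estimate (Wallace rule): 2*(A+T) + 4*(G+C), in degrees C."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def generate_barcodes(n, length=18, gc_range=(0.4, 0.6), tm_range=(50, 58),
                      min_distance=4, seed=0, max_tries=100_000):
    """Draw random candidates and keep those passing every filter."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if not gc_range[0] <= gc_fraction(cand) <= gc_range[1]:
            continue
        if not tm_range[0] <= wallace_tm(cand) <= tm_range[1]:
            continue
        if any(hamming(cand, other) < min_distance for other in accepted):
            continue
        accepted.append(cand)
    return accepted


codes = generate_barcodes(10)
```

A minimum pairwise distance of this kind means that a small number of sequencing errors in a barcode need not make it identical to another barcode; a nearest-neighbour Tm model could be substituted for the Wallace rule where a more realistic estimate is wanted.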

As mentioned above, and in a preferred embodiment, the capture probe also comprises a universal domain (or linker domain or tag). The universal domain of the capture probe is located directly or indirectly upstream, i.e. closer to the 5′ end of the capture probe nucleic acid molecule, of the positional domain. Preferably the universal domain is directly adjacent to the positional domain, i.e. there is no intermediate sequence between the positional domain and the universal domain. In embodiments where the capture probe comprises a universal domain, the domain will form the 5′ end of the capture probe, which may be immobilized directly or indirectly on the substrate of the array.
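
The 5′ to 3′ order of the domains in the preferred arrangement (universal domain, then positional domain, then capture domain, each directly adjacent to the next) may be represented as in the following sketch. The class name, field names and example sequences are hypothetical and serve only to illustrate the layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureProbe:
    """Capture probe domains, each written 5'->3'."""
    positional: str      # unique to one feature of the array
    capture: str         # e.g. a poly-T element, at the free 3' end
    universal: str = ""  # optional; forms the 5' end when present

    @property
    def sequence(self) -> str:
        # Directly adjacent domains: no intermediate sequences are modelled.
        return self.universal + self.positional + self.capture


probe = CaptureProbe(positional="ACGTTGCA", capture="T" * 20,
                     universal="GGCCAAGG")
```

Here `probe.sequence` begins with the universal domain and ends with the poly-T capture domain, so the 3′ end is the part available for extension.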

The universal domain may be utilized in a number of ways in the methods and uses of the invention. For example, the methods of the invention comprise a step of releasing (e.g. removing) at least part of the synthesised (i.e. extended or ligated) nucleic acid, e.g. cDNA molecules from the surface of the array. As described elsewhere herein, this may be achieved in a number of ways, of which one comprises cleaving the nucleic acid, e.g. cDNA molecule from the surface of the array. Thus, the universal domain may itself comprise a cleavage domain, i.e. a sequence that can be cleaved specifically, either chemically or preferably enzymatically.

Thus, the cleavage domain may comprise a sequence that is recognised by one or more enzymes capable of cleaving a nucleic acid molecule, i.e. capable of breaking the phosphodiester linkage between two or more nucleotides. For instance, the cleavage domain may comprise a restriction endonuclease (restriction enzyme) recognition sequence. Restriction enzymes cut double-stranded or single stranded DNA at specific recognition nucleotide sequences known as restriction sites and suitable enzymes are well known in the art. For example, it is particularly advantageous to use rare-cutting restriction enzymes, i.e. enzymes with a long recognition site (at least 8 base pairs in length), to reduce the possibility of cleaving elsewhere in the nucleic acid, e.g. cDNA molecule. In this respect, it will be seen that removing or releasing at least part of the nucleic acid, e.g. cDNA molecule requires releasing a part comprising the positional domain of the nucleic acid, e.g. cDNA and all of the sequence downstream of the domain, i.e. all of the sequence that is 3′ to the positional domain. Hence, cleavage of the nucleic acid, e.g. cDNA molecule should take place 5′ to the positional domain.
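
The benefit of a long recognition site can be illustrated with a simple calculation: in a random sequence with equal base frequencies (an idealising assumption), a given site of n base pairs is expected about once every 4^n base pairs. The sketch below compares a 6 bp site with an 8 bp site and scans a sequence for internal sites. The function names and the test sequence are hypothetical; GAATTC (EcoRI) and GCGGCCGC (NotI) are used as familiar examples of 6 bp and 8 bp sites.

```python
def expected_site_spacing(site_length):
    """Average spacing (bp) of a fixed site in random, equal-frequency DNA."""
    return 4 ** site_length


def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of the site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


six_cutter_spacing = expected_site_spacing(6)    # e.g. GAATTC
eight_cutter_spacing = expected_site_spacing(8)  # e.g. GCGGCCGC
internal_sites = find_sites("AAGCGGCCGCTT", "GCGGCCGC")
```

Under this assumption an 8 bp site is expected 16 times less often than a 6 bp site, which is why cleavage within the captured cDNA itself becomes less likely.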

By way of example, the cleavage domain may comprise a poly-U sequence which may be cleaved by a mixture of Uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase Endonuclease VIII, commercially known as the USER™ enzyme.
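
As a simplified illustration of release at a poly-U cleavage domain, the following sketch treats each uracil as a break point and returns the portion of the immobilized strand lying 3′ of the last uracil. The function name, the example sequences and the reduction of the enzymatic reaction to a string operation are all assumptions of this sketch.

```python
def release_by_uracil_cleavage(immobilized):
    """Return the fragment 3' of the last U, or None if no U is present.

    The strand is written 5'->3', with its 5' end attached to the array.
    """
    last_u = immobilized.rfind("U")
    if last_u == -1:
        return None  # nothing to cleave; the strand stays on the array
    return immobilized[last_u + 1:]


# Hypothetical strand: linker + poly-U cleavage domain + positional + capture
strand = "GG" + "UUU" + "ACGTTGCA" + "TTTT"
released = release_by_uracil_cleavage(strand)
```

Because the cleavage domain lies 5′ of the positional domain in this example, the released fragment still carries the positional domain and everything 3′ of it.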

A further example of a cleavage domain can be utilised in embodiments where the capture probe is immobilized to the array substrate indirectly, i.e. via a surface probe. The cleavage domain may comprise one or more mismatch nucleotides, i.e. when the complementary parts of the surface probe and the capture probe are not 100% complementary. Such a mismatch is recognised, e.g. by the MutY and T7 endonuclease I enzymes, which results in cleavage of the nucleic acid molecule at the position of the mismatch.

In some embodiments of the invention, the positional domain of the capture probe comprises a cleavage domain, wherein the said cleavage domain is located at the 5′ end of the positional domain.

The universal domain may also comprise an amplification domain. This may be in addition to, or instead of, a cleavage domain. In some embodiments of the invention, as described elsewhere herein, it may be advantageous to amplify the nucleic acid, e.g. cDNA molecules, for example after they have been released (e.g. removed or cleaved) from the array substrate. It will be appreciated however, that the initial cycle of amplification, or indeed any or all further cycles of amplification may also take place in situ on the array. The amplification domain comprises a distinct sequence to which an amplification primer may hybridize. The amplification domain of the universal domain of the capture probe is preferably identical for each species of capture probe. Hence a single amplification reaction will be sufficient to amplify all of the nucleic acid, e.g. cDNA molecules (which may or may not be released from the array substrate prior to amplification).

Any suitable sequence may be used as the amplification domain in the capture probes of the invention. By a suitable sequence, it is meant that the amplification domain should not interfere with (i.e. inhibit or distort) the interaction between the nucleic acid, e.g. RNA of the tissue sample and the capture domain of the capture probe. Furthermore, the amplification domain should comprise a sequence that is not the same or substantially the same as any sequence in the nucleic acid, e.g. RNA of the tissue sample, such that the primer used in the amplification reaction can hybridize only to the amplification domain under the amplification conditions of the reaction.

For example, the amplification domain should be designed such that nucleic acid molecules in the tissue sample do not hybridize specifically to the amplification domain or the complementary sequence of the amplification domain. Preferably, the nucleic acid sequence of the amplification domain of the capture probes and the complement thereof has less than 80% sequence identity to the nucleic acid sequences in the tissue sample. Preferably, the positional domain of the capture probe has less than 70%, 60%, 50% or less than 40% sequence identity across a substantial part of the nucleic acid molecules in the tissue sample. Sequence identity may be determined by any appropriate method known in the art, e.g. using the BLAST alignment algorithm.

Thus, alone, the universal domain of the capture probe may be seen as a universal domain oligonucleotide, which may be used in the synthesis of the capture probe in embodiments where the capture probe is immobilized on the array indirectly.

In one representative embodiment of the invention only the positional domain of each species of capture probe is unique. Hence, the capture domains and universal domains (if present) are in one embodiment the same for every species of capture probe for any particular array to ensure that the capture of the nucleic acid, e.g. RNA from the tissue sample is uniform across the array. However, as discussed above, in some embodiments the capture domains may differ by virtue of including random or degenerate sequences.

In embodiments where the capture probe is immobilized on the substrate of the array indirectly, e.g. via hybridisation to a surface probe, the capture probe may be synthesised on the array as described below.

The surface probes are immobilized on the substrate of the array directly by or at, e.g. their 3′ end. Each species of surface probe is unique to each feature (distinct position) of the array and is partly complementary to the capture probe, defined above.

Hence the surface probe comprises at its 5′ end a domain (complementary capture domain) that is complementary to a part of the capture domain that does not bind to the nucleic acid, e.g. RNA of the tissue sample. In other words, it comprises a domain that can hybridize to at least part of a capture domain oligonucleotide. The surface probe further comprises a domain (complementary positional domain or complementary feature identification domain) that is complementary to the positional domain of the capture probe. The complementary positional domain is located directly or indirectly downstream (i.e. at the 3′ end) of the complementary capture domain, i.e. there may be an intermediary or linker sequence separating the complementary positional domain and the complementary capture domain. In embodiments where the capture probe is synthesized on the array surface, the surface probes of the array always comprise a domain (complementary universal domain) at the 3′ end of the surface probe, i.e. directly or indirectly downstream of the positional domain, which is complementary to the universal domain of the capture probe. In other words, it comprises a domain that can hybridize to at least part of the universal domain oligonucleotide.

In some embodiments of the invention the sequence of the surface probe shows 100% complementarity or sequence identity to the positional and universal domains and to the part of the capture domain that does not bind to the nucleic acid, e.g. RNA of the tissue sample. In other embodiments the sequence of the surface probe may show less than 100% sequence identity to the domains of the capture probe, e.g. less than 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91% or 90%. In a particularly preferred embodiment of the invention, the complementary universal domain shares less than 100% sequence identity to the universal domain of the capture probe.

In one embodiment of the invention, the capture probe is synthesized or generated on the substrate of the array. In a representative embodiment (see FIG. 3), the array comprises surface probes as defined above. Oligonucleotides that correspond to the capture domain and universal domain of the capture probe are contacted with the array and allowed to hybridize to the complementary domains of the surface probes. Excess oligonucleotides may be removed by washing the array under standard hybridization conditions. The resultant array comprises partially single stranded probes, wherein both the 5′ and 3′ ends of the surface probe are double stranded and the complementary positional domain is single stranded. The array may be treated with a polymerase enzyme to extend the 3′ end of the universal domain oligonucleotide, in a template dependent manner, so as to synthesize the positional domain of the capture probe. The 3′ end of the synthesized positional domain is then ligated, e.g. using a ligase enzyme, to the 5′ end of the capture domain oligonucleotide to generate the capture probe. It will be understood in this regard that the 5′ end of the capture domain oligonucleotide is phosphorylated to enable ligation to take place. As each species of surface probe comprises a unique complementary positional domain, each species of capture probe will comprise a unique positional domain.
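
The hybridisation, extension and ligation steps just described can be followed in a small string-level model, given here as a sketch only. The function names, the example sequences and the `anchor_len` parameter (the number of 5′ bases of the capture domain oligonucleotide assumed to pair with the surface probe) are hypothetical; all sequences are written 5′ to 3′ and perfect complementarity is assumed.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def synthesize_capture_probe(surface_probe, universal_oligo, capture_oligo,
                             anchor_len):
    """Model on-array synthesis of a capture probe from a surface probe."""
    comp_universal = reverse_complement(universal_oligo)
    comp_anchor = reverse_complement(capture_oligo[:anchor_len])
    # Hybridisation: universal oligo pairs with the 3' end of the surface
    # probe, capture oligo anchor pairs with its 5' end.
    if not surface_probe.endswith(comp_universal):
        raise ValueError("universal oligo does not match surface probe 3' end")
    if not surface_probe.startswith(comp_anchor):
        raise ValueError("capture oligo does not match surface probe 5' end")
    # The single-stranded middle is the complementary positional domain.
    template = surface_probe[len(comp_anchor):
                             len(surface_probe) - len(comp_universal)]
    # Extension: the polymerase copies the template from the oligo's 3' end.
    positional = reverse_complement(template)
    # Ligation: extended strand is joined to the 5' end of the capture oligo.
    return universal_oligo + positional + capture_oligo


universal = "GGCCAAGG"
barcode = "ACGTTGCA"
capture_oligo = "CTGACT" + "T" * 20  # 6-base anchor followed by poly-T
surface_probe = reverse_complement(universal + barcode + "CTGACT")
capture_probe = synthesize_capture_probe(surface_probe, universal,
                                         capture_oligo, anchor_len=6)
```

In this model the positional domain of the resulting capture probe is determined entirely by the surface probe, which mirrors the statement that a unique complementary positional domain yields a unique positional domain.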

The term “hybridisation” or “hybridises” as used herein refers to the formation of a duplex between nucleotide sequences which are sufficiently complementary to form duplexes via Watson-Crick base pairing. Two nucleotide sequences are “complementary” to one another when those molecules share base pair organization homology. “Complementary” nucleotide sequences will combine with specificity to form a stable duplex under appropriate hybridization conditions.

For instance, two sequences are complementary when a section of a first sequence can bind to a section of a second sequence in an anti-parallel sense wherein the 3′-end of each sequence binds to the 5′-end of the other sequence and each A, T(U), G and C of one sequence is then aligned with a T(U), A, C and G, respectively, of the other sequence. RNA sequences can also include complementary G-U or U-G base pairs. Thus, two sequences need not have perfect homology to be “complementary” under the invention. Usually two sequences are sufficiently complementary when at least about 90% (preferably at least about 95%) of the nucleotides share base pair organization over a defined length of the molecule.
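By way of illustration only, the definition of complementarity given above may be sketched in Python. The function names, the ungapped equal-length comparison and the use of 0.90 as a default threshold are assumptions made for this sketch; T and U are treated as equivalent, and G-U pairing is optional.

```python
# Watson-Crick partners; T and U are treated as equivalent (written here as T).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
# G-U / U-G pairs, which the text notes may also occur in RNA sequences.
WOBBLE = {("G", "T"), ("T", "G")}


def fraction_paired(first, second, allow_wobble=False):
    """Fraction of positions that pair when `second` is aligned
    anti-parallel to `first`. Both sequences are written 5'->3' and must
    be of equal length (an ungapped comparison, assumed for simplicity)."""
    a = first.upper().replace("U", "T")
    b = second.upper().replace("U", "T")[::-1]  # read the second strand 3'->5'
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    paired = sum((x, y) in allowed for x, y in zip(a, b))
    return paired / len(a)


def is_complementary(first, second, threshold=0.90, allow_wobble=False):
    """True when at least `threshold` of the aligned positions pair."""
    return fraction_paired(first, second, allow_wobble) >= threshold
```

In this sketch a 10-nucleotide pair of sequences with a single mismatch gives a paired fraction of 0.9 and is therefore still treated as complementary at the 90% level.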

The domains of the capture and surface probes thus contain a region of complementarity. Furthermore the capture domain of the capture probe contains a region of complementarity for the nucleic acid, e.g. RNA (preferably mRNA) of the tissue sample.

The capture probe may also be synthesised on the array substrate using polymerase extension (in a manner similar to that described above) and a terminal transferase enzyme to add a “tail” which may constitute the capture domain. This is described further in Example 7 below. The use of terminal transferases to add nucleotide sequences to the end of an oligonucleotide is known in the art, e.g. to introduce a homopolymeric tail e.g. a poly-T tail. Accordingly, in such a synthesis an oligonucleotide that corresponds to the universal domain of the capture probe may be contacted with the array and allowed to hybridize to the complementary domain of the surface probes. Excess oligonucleotides may be removed by washing the array under standard hybridization conditions. The resultant array comprises partially single stranded probes, wherein the 5′ ends of the surface probes are double stranded and the complementary positional domain is single stranded. The array may be treated with a polymerase enzyme to extend the 3′ end of the universal domain oligonucleotide, in a template dependent manner, so as to synthesize the positional domain of the capture probe. The capture domain, e.g. comprising a poly-T sequence, may then be introduced using a terminal transferase to add a poly-T tail to generate the capture probe.
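The two-step synthesis just described (template-dependent extension followed by tailing) may be pictured with the following toy string model in Python. It is a sketch only: the sequences are hypothetical, the arrangement of the surface probe used here (complementary positional domain followed by complementary universal domain, written 5′ to 3′) is an assumption made for illustration, and no enzyme kinetics or end chemistry is modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_primer(primer, template):
    """Toy model of template-dependent extension of the primer's 3' end.

    The primer anneals where its reverse complement occurs in the template;
    the template bases lying 5' of that site are then copied onto the primer.
    """
    site = template.find(revcomp(primer))
    if site < 0:
        raise ValueError("primer does not anneal to the template")
    return primer + revcomp(template[:site])


def add_tail(probe, base="T", length=20):
    """Toy model of terminal transferase adding a homopolymeric 3' tail."""
    return probe + base * length


# Hypothetical sequences, chosen only to make the example easy to follow.
universal = "GACTGG"
positional = "ATCGTA"
surface_probe = revcomp(positional) + revcomp(universal)

extended = extend_primer(universal, surface_probe)  # universal + positional
capture_probe = add_tail(extended, base="T", length=5)
```

Under these assumptions the extended oligonucleotide carries the positional domain dictated by the surface probe, and the added poly-T tail serves as the capture domain.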

The typical array of, and for use in the methods of, the invention may contain multiple spots, or “features”. A feature may be defined as an area or distinct position on the array substrate at which a single species of capture probe is immobilized. Hence each feature will comprise a multiplicity of probe molecules, of the same species. It will be understood in this context that whilst it is encompassed that each capture probe of the same species may have the same sequence, this need not necessarily be the case. Each species of capture probe will have the same positional domain (i.e. each member of a species and hence each probe in a feature will be identically “tagged”), but the sequence of each member of the feature (species) may differ, because the sequence of a capture domain may differ.

As described above, random or degenerate capture domains may be used. Thus the capture probes within a feature may comprise different random or degenerate sequences. The number and density of the features on the array will determine the resolution of the array, i.e. the level of detail at which the transcriptome or genome of the tissue sample can be analysed. Hence, a higher density of features will typically increase the resolution of the array.

As discussed above, the size and number of the features on the array of the invention will depend on the nature of the tissue sample and required resolution.

Thus, if it is desirable to determine a transcriptome or genome only for regions of cells within a tissue sample (or the sample contains large cells) then the number and/or density of features on the array may be reduced (i.e. lower than the possible maximum number of features) and/or the size of the features may be increased (i.e. the area of each feature may be greater than the smallest possible feature), e.g. an array comprising few large features. Alternatively, if it is desirable to determine a transcriptome or genome of individual cells within a sample, it may be necessary to use the maximum number of features possible, which would necessitate using the smallest possible feature size, e.g. an array comprising many small features.
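The relationship between feature number, array area and resolution may be illustrated with a short calculation. The following Python sketch assumes, purely for illustration, that the features are laid out on a regular square grid and that a cell presents a circular footprint; the function names and the example values are hypothetical.

```python
import math


def feature_pitch_um(n_features, array_area_mm2):
    """Centre-to-centre spacing (in micrometres) of features laid out on a
    square grid covering the given area (square grid is an assumption)."""
    area_um2 = array_area_mm2 * 1_000_000  # 1 mm^2 = 1e6 um^2
    return math.sqrt(area_um2 / n_features)


def features_per_cell(cell_diameter_um, pitch_um):
    """Approximate number of grid positions under one cell, treating the
    cell footprint as a circle (an assumption)."""
    cell_area_um2 = math.pi * (cell_diameter_um / 2) ** 2
    return cell_area_um2 / pitch_um ** 2


# Hypothetical example: 10000 features within 1 mm^2 gives a 10 um pitch,
# so a cell of 20 um diameter would overlie roughly three grid positions.
pitch = feature_pitch_um(10_000, 1)
coverage = features_per_cell(20, pitch)
```

A value of `features_per_cell` at or above 1 suggests that individual cells could in principle be resolved, whereas a value well below 1 corresponds to resolution at the level of cell groups.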

Whilst single cell resolution may be a preferred and advantageous feature of the present invention, it is not essential to achieve this, and resolution at the cell group level is also of interest, for example to detect or distinguish a particular cell type or tissue region, e.g. normal vs tumour cells.

In representative embodiments of the invention, an array may contain at least 2, 5, 10, 50, 100, 500, 750, 1000, 1500, 3000, 5000, 10000, 20000, 40000, 50000, 75000, 100000, 150000, 200000, 300000, 400000, 500000, 750000, 800000, 1000000, 1200000, 1500000, 1750000, 2000000, 2100000, 3000000, 3500000, 4000000 or 4200000 features. Whilst 4200000 represents the maximum number of features presently available on a commercial array, it is envisaged that arrays with features in excess of this may be prepared and such arrays are of interest in the present invention. As noted above, feature size may be decreased and this may allow greater numbers of features to be accommodated within the same or a similar area. By way of example, these features may be comprised in an area of less than about 20 cm², 10 cm², 5 cm², 1 cm², 1 mm², or 100 μm².

Thus, in some embodiments of the invention the area of each feature may be from about 1 μm², 2 μm², 3 μm², 4 μm², 5 μm², 10 μm², 12 μm², 15 μm², 20 μm², 50 μm², 75 μm², 100 μm², 150 μm², 200 μm², 250 μm², 300 μm², 400 μm², or 500 μm².

It will be evident that a tissue sample from any organism could be used in the methods of the invention, e.g. plant, animal or fungal. The array of the invention allows the capture of any nucleic acid, e.g. mRNA molecules, which are present in cells that are capable of transcription and/or translation. The arrays and methods of the invention are particularly suitable for isolating and analysing the transcriptome or genome of cells within a sample, wherein spatial resolution of the transcriptomes or genomes is desirable, e.g. where the cells are interconnected or in contact directly with adjacent cells. However, it will be apparent to a person of skill in the art that the methods of the invention may also be useful for the analysis of the transcriptome or genome of different cells or cell types within a sample even if said cells do not interact directly, e.g. a blood sample. In other words, the cells do not need to be present in the context of a tissue and can be applied to the array as single cells (e.g. cells isolated from a non-fixed tissue). Such single cells, whilst not necessarily fixed to a certain position in a tissue, are nonetheless applied to a certain position on the array and can be individually identified. Thus, in the context of analysing cells that do not interact directly, or are not present in a tissue context, the spatial properties of the described methods may be applied to obtaining or retrieving unique or independent transcriptome or genome information from individual cells.

The sample may thus be a harvested or biopsied tissue sample, or possibly a cultured sample. Representative samples include clinical samples e.g. whole blood or blood-derived products, blood cells, tissues, biopsies, or cultured tissues or cells etc. including cell suspensions. Artificial tissues may for example be prepared from cell suspension (including for example blood cells). Cells may be captured in a matrix (for example a gel matrix e.g. agar, agarose, etc) and may then be sectioned in a conventional way. Such procedures are known in the art in the context of immunohistochemistry (see e.g. Andersson et al 2006, J. Histochem. Cytochem. 54(12): 1413-23. Epub 2006 Sep. 6).

The mode of tissue preparation and how the resulting sample is handled may affect the transcriptomic or genomic analysis of the methods of the invention.

Moreover, various tissue samples will have different physical characteristics and it is well within the skill of a person in the art to perform the necessary manipulations to yield a tissue sample for use with the methods of the invention. However, it is evident from the disclosures herein that any method of sample preparation may be used to obtain a tissue sample that is suitable for use in the methods of the invention. For instance any layer of cells with a thickness of approximately 1 cell or less may be used in the methods of the invention. In one embodiment, the thickness of the tissue sample may be less than 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2 or 0.1 of the cross-section of a cell. However, as noted above, the present invention is not limited to single cell resolution and hence it is not a requirement that the tissue sample has a thickness of one cell diameter or less; thicker tissue samples may if desired be used. For example cryostat sections may be used, which may be e.g. 10-20 μm thick.

The tissue sample may be prepared in any convenient or desired way and the invention is not restricted to any particular type of tissue preparation. Fresh, frozen, fixed or unfixed tissues may be used. Any desired convenient procedure may be used for fixing or embedding the tissue sample, as described and known in the art. Thus any known fixatives or embedding materials may be used.

As a first representative example of a tissue sample for use in the invention, the tissue may be prepared by deep freezing at a temperature suitable to maintain or preserve the integrity (i.e. the physical characteristics) of the tissue structure, e.g. less than −20° C. and preferably less than −25, −30, −40, −50, −60, −70 or −80° C. The frozen tissue sample may be sectioned, i.e. thinly sliced, onto the array surface by any suitable means. For example, the tissue sample may be prepared using a chilled microtome, a cryostat, set at a temperature suitable to maintain both the structural integrity of the tissue sample and the chemical properties of the nucleic acids in the sample, e.g. to less than −15° C. and preferably less than −20 or −25° C.

Thus, the sample should be treated so as to minimize the degeneration or degradation of the nucleic acid, e.g. RNA in the tissue. Such conditions are well-established in the art and the extent of any degradation may be monitored through nucleic acid extraction, e.g. total RNA extraction and subsequent quality analysis at various stages of the preparation of the tissue sample.

In a second representative example, the tissue may be prepared using standard methods of formalin-fixation and paraffin-embedding (FFPE), which are well-established in the art. Following fixation of the tissue sample and embedding in a paraffin or resin block, the tissue samples may be sectioned, i.e. thinly sliced, onto the array. As noted above, other fixatives and/or embedding materials can be used.

It will be apparent that the tissue sample section will need to be treated to remove the embedding material e.g. to deparaffinize, i.e. to remove the paraffin or resin, from the sample prior to carrying out the methods of the invention. This may be achieved by any suitable method and the removal of paraffin or resin or other material from tissue samples is well established in the art, e.g. by incubating the sample (on the surface of the array) in an appropriate solvent e.g. xylene, e.g. twice for 10 minutes, followed by an ethanol rinse, e.g. 99.5% ethanol for 2 minutes, 96% ethanol for 2 minutes, and 70% ethanol for 2 minutes.

It will be evident to the skilled person that the RNA in tissue sections prepared using methods of FFPE or other methods of fixing and embedding is more likely to be partially degraded than in the case of frozen tissue. However, without wishing to be bound by any particular theory, it is believed that this may be advantageous in the methods of the invention. For instance, if the RNA in the sample is partially degraded the average length of the RNA polynucleotides will be shorter and more randomized than in a non-degraded sample. It is postulated therefore that partially degraded RNA would result in less bias in the various processing steps, described elsewhere herein, e.g. ligation of adaptors (amplification domains), amplification of the cDNA molecules and sequencing thereof.

Hence, in one embodiment of the invention the tissue sample, i.e. the section of the tissue sample contacted with the array, is prepared using FFPE or other methods of fixing and embedding. In other words the sample may be fixed, e.g. fixed and embedded. In an alternative embodiment of the invention the tissue sample is prepared by deep-freezing. In another embodiment a touch imprint of a tissue may be used, according to procedures known in the art. In other embodiments an unfixed sample may be used.

The thickness of the tissue sample section for use in the methods of the invention may be dependent on the method used to prepare the sample and the physical characteristics of the tissue. Thus, any suitable section thickness may be used in the methods of the invention. In representative embodiments of the invention the thickness of the tissue sample section will be at least 0.1 μm, further preferably at least 0.2, 0.3, 0.4, 0.5, 0.7, 1.0, 1.5, 2, 3, 4, 5, 6, 7, 8, 9 or 10 μm. In other embodiments the thickness of the tissue sample section is at least 10, 12, 13, 14, 15, 20, 30, 40 or 50 μm. However, the thickness is not critical and these are representative values only. Thicker samples may be used if desired or convenient e.g. 70 or 100 μm or more. Typically, the thickness of the tissue sample section is between 1-100 μm, 1-50 μm, 1-30 μm, 1-25 μm, 1-20 μm, 1-15 μm, 1-10 μm, 2-8 μm, 3-7 μm or 4-6 μm, but as mentioned above thicker samples may be used.

On contact of the tissue sample section with the array, e.g. following removal of the embedding material e.g. deparaffinization, the nucleic acid, e.g. RNA molecules in the tissue sample will bind to the immobilized capture probes on the array. In some embodiments it may be advantageous to facilitate the hybridization of the nucleic acid, e.g. RNA molecules to the capture probes. Typically, facilitating the hybridization comprises modifying the conditions under which hybridization occurs. The primary conditions that can be modified are the time and temperature of the incubation of the tissue section on the array prior to the reverse transcription step, which is described elsewhere herein.

For instance, on contacting the tissue sample section with the array, the array may be incubated for at least 1 hour to allow the nucleic acid, e.g. RNA to hybridize to the capture probes. Preferably the array may be incubated for at least 2, 3, 5, 10, 12, 15, 20, 22 or 24 hours or until the tissue sample section has dried. The array incubation time is not critical and any convenient or desired time may be used. Typical array incubations may be up to 72 hours. Thus, the incubation may occur at any suitable temperature, for instance at room temperature, although in a preferred embodiment the tissue sample section is incubated on the array at a temperature of at least 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36 or 37° C. Incubation temperatures of up to 55° C. are commonplace in the art. In a particularly preferred embodiment the tissue sample section is allowed to dry on the array at 37° C. for 24 hours. Once the tissue sample section has dried the array may be stored at room temperature before performing the reverse transcription step. It will be understood that if the tissue sample section is allowed to dry on the surface of the array, it will need to be rehydrated before further manipulation of the captured nucleic acid can be achieved, e.g. the step of reverse transcribing the captured RNA.

Hence, the method of the invention may comprise a further step of rehydrating the tissue sample after contacting the sample with the array.

In some embodiments it may be advantageous to block (e.g. mask or modify) the capture probes prior to contacting the tissue sample with the array, particularly when the nucleic acid in the tissue sample is subject to a process of modification prior to its capture on the array. Specifically, it may be advantageous to block or modify the free 3′ end of the capture probe. In a particular embodiment, the nucleic acid in the tissue sample, e.g. fragmented genomic DNA, may be modified such that it can be captured by the capture probe. For instance, and as described in more detail below, an adaptor sequence (comprising a binding domain capable of binding to the capture domain of the capture probe) may be added to the end of the nucleic acid, e.g. fragmented genomic DNA. This may be achieved by, e.g. ligation of an adaptor or extension of the nucleic acid, e.g. using an enzyme to incorporate additional nucleotides at the end of the sequence, e.g. a poly-A tail. It is necessary to block or modify the capture probes, particularly the free 3′ end of the capture probe, prior to contacting the tissue sample with the array to avoid modification of the capture probes, e.g. to avoid the addition of a poly-A tail to the free 3′ end of the capture probes. Preferably a blocking domain may be incorporated into the capture probe when it is synthesised. However, the blocking domain may be incorporated into the capture probe after its synthesis.

In some embodiments the capture probes may be blocked by any suitable and reversible means that would prevent modification of the capture domains during the process of modifying the nucleic acid of the tissue sample, which occurs after the tissue sample has been contacted with the array. In other words, the capture probes may be reversibly masked or modified such that the capture domain of the capture probe does not comprise a free 3′ end, i.e. such that the 3′ end is removed or modified, or made inaccessible so that the capture probe is not susceptible to the process which is used to modify the nucleic acid of the tissue sample, e.g. ligation or extension, or the additional nucleotides may be removed to reveal and/or restore the 3′ end of the capture domain of the capture probe.

For example, blocking probes may be hybridised to the capture probes to mask the free 3′ end of the capture domain, e.g. hairpin probes or partially double stranded probes, suitable examples of which are known in the art. The free 3′ end of the capture domain may be blocked by chemical modification, e.g. addition of an azidomethyl group as a chemically reversible capping moiety such that the capture probes do not comprise a free 3′ end. Suitable alternative capping moieties are well known in the art, e.g. the terminal nucleotide of the capture domain could be a reversible terminator nucleotide, which could be included in the capture probe during or after probe synthesis.

Alternatively or additionally, the capture domain of the capture probe could be modified so as to allow the removal of any modifications of the capture probe, e.g. additional nucleotides, that occur when the nucleic acid molecules of the tissue sample are modified. For instance, the capture probes may comprise an additional sequence downstream of the capture domain, i.e. 3′ to the capture domain, namely a blocking domain. This could be in the form of, e.g. a restriction endonuclease recognition sequence or a sequence of nucleotides cleavable by specific enzyme activities, e.g. uracil. Following the modification of the nucleic acid of the tissue sample, the capture probes could be subjected to an enzymatic cleavage, which would allow the removal of the blocking domain and any of the additional nucleotides that are added to the 3′ end of the capture probe during the modification process. The removal of the blocking domain would reveal and/or restore the free 3′ end of the capture domain of the capture probe. The blocking domain could be synthesised as part of the capture probe or could be added to the capture probe in situ (i.e. as a modification of an existing array), e.g. by ligation of the blocking domain.
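The cleavable blocking domain described above may be pictured with the following toy string model in Python. It is a sketch under stated assumptions: the sequences are hypothetical, the capture probe itself is assumed to contain no uracil, cleavage is modelled simply as truncation at the first uracil, and the chemistry of the resulting 3′ end is not modelled.

```python
def add_blocking_domain(capture_probe, blocking_domain="UGCATG"):
    """Append a blocking domain 3' of the capture domain. The default
    sequence is hypothetical and begins with a cleavable uracil."""
    return capture_probe + blocking_domain


def tail(molecule, base="A", length=10):
    """Toy model of the modification step, which adds nucleotides to every
    free 3' end on the array, including that of a capture probe."""
    return molecule + base * length


def cleave_blocking_domain(probe, cleavable="U"):
    """Remove the blocking domain and anything added 3' of it by cutting
    at the first cleavable nucleotide (truncation is an assumption)."""
    site = probe.find(cleavable)
    return probe if site < 0 else probe[:site]


# Hypothetical capture probe ending in a short poly-T capture domain.
capture_probe = "ACGTTTTT"
blocked = add_blocking_domain(capture_probe)
modified = tail(blocked, base="A", length=5)  # unwanted poly-A on the probe
restored = cleave_blocking_domain(modified)   # original 3' end recovered
```

In this model the unwanted additions are carried on the blocking domain rather than on the capture domain, so that cleavage returns the probe to its original sequence.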

The capture probes may be blocked using any combination of the blocking mechanisms described above.

Once the nucleic acid of the tissue sample, e.g. fragmented genomic DNA, has been modified to enable it to hybridise to the capture domain of the capture probe, the capture probe must be unblocked, e.g. by dissociation of the blocking oligonucleotide, removal of the capping moiety and/or blocking domain.

In order to correlate the sequence analysis or transcriptome or genome information obtained from each feature of the array with the region (i.e. an area or cell) of the tissue sample the tissue sample is oriented in relation to the features on the array. In other words, the tissue sample is placed on the array such that the position of a capture probe on the array may be correlated with a position in the tissue sample. Thus it may be identified where in the tissue sample the position of each species of capture probe (or each feature of the array) corresponds. In other words, it may be identified to which location in the tissue sample the position of each species of capture probe corresponds. This may be done by virtue of positional markers present on the array, as described below. Conveniently, but not necessarily, the tissue sample may be imaged following its contact with the array. This may be performed before or after the nucleic acid of the tissue sample is processed, e.g. before or after the cDNA generation step of the method, in particular the step of generating the first strand cDNA by reverse transcription. In a preferred embodiment the tissue sample is imaged prior to the release of the captured and synthesised (i.e. extended or ligated) DNA, e.g. cDNA, from the array. In a particularly preferred embodiment the tissue is imaged after the nucleic acid of the tissue sample has been processed, e.g. after the reverse transcription step, and any residual tissue is removed (e.g. washed) from the array prior to the release of molecules, e.g. of the cDNA from the array. In some embodiments, the step of processing the captured nucleic acid, e.g. the reverse transcription step, may act to remove residual tissue from the array surface, e.g. when using tissue prepared by deep-freezing. In such a case, imaging of the tissue sample may take place prior to the processing step, e.g. the cDNA synthesis step. Generally speaking, imaging may take place at any time after contacting the tissue sample with the array, but before any step which degrades or removes the tissue sample. As noted above, this may depend on the tissue sample.

Advantageously, the array may comprise markers to facilitate the orientation of the tissue sample or the image thereof in relation to the features of the array. Any suitable means for marking the array may be used such that the markers are detectable when the tissue sample is imaged. For instance, a molecule, e.g. a fluorescent molecule, that generates a signal, preferably a visible signal, may be immobilized directly or indirectly on the surface of the array. Preferably, the array comprises at least two markers in distinct positions on the surface of the array, further preferably at least 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 30, 40, 50, 60, 70, 80, 90 or 100 markers. Conveniently several hundred or even several thousand markers may be used. The markers may be provided in a pattern, for example make up an outer edge of the array, e.g. an entire outer row of the features of an array. Other informative patterns may be used, e.g. lines sectioning the array. This may facilitate aligning an image of the tissue sample to an array, or indeed generally in correlating the features of the array to the tissue sample. Thus, the marker may be an immobilized molecule with which a signal giving molecule may interact to generate a signal. In a representative example, the array may comprise a marker feature, e.g. a nucleic acid probe immobilized on the substrate of the array, to which a labelled nucleic acid may hybridize. For instance, the labelled nucleic acid molecule, or marker nucleic acid, may be linked or coupled to a chemical moiety capable of fluorescing when subjected to light of a specific wavelength (or range of wavelengths), i.e. excited. Such a marker nucleic acid molecule may be contacted with the array before, contemporaneously with or after the tissue sample is stained in order to visualize or image the tissue sample. However, the marker must be detectable when the tissue sample is imaged. Thus, in a preferred embodiment the marker may be detected using the same imaging conditions used to visualize the tissue sample.
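As an illustration of how markers might be used to relate array positions to an image of the tissue sample, the following Python sketch fits a transform from two marker correspondences. It assumes, for the purposes of the sketch only, that the image differs from the array layout by a rotation, a uniform scaling and a translation (no mirroring or distortion); all names and coordinates are hypothetical.

```python
def fit_similarity(array_markers, image_markers):
    """Fit image = a * array + b from exactly two marker correspondences.

    Points are (x, y) tuples handled as complex numbers, so the single
    complex factor `a` encodes rotation and uniform scaling together.
    """
    a1, a2 = (complex(*p) for p in array_markers)
    i1, i2 = (complex(*p) for p in image_markers)
    if a1 == a2:
        raise ValueError("the two markers must be at distinct positions")
    a = (i2 - i1) / (a2 - a1)
    b = i1 - a * a1
    return a, b


def array_to_image(point, transform):
    """Map an (x, y) position on the array to image coordinates."""
    a, b = transform
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Hypothetical markers: array coordinates and where they appear in the image.
transform = fit_similarity([(0, 0), (10, 0)], [(100, 50), (100, 150)])
feature_in_image = array_to_image((5, 5), transform)
```

With more than two markers, as contemplated above, a least-squares fit over all markers would generally be more robust than the two-point solution shown here.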

In a particularly preferred embodiment of the invention, the array comprises marker features to which a labelled, preferably fluorescently labelled, marker nucleic acid molecule, e.g. oligonucleotide, is hybridized.

The step of imaging the tissue may use any convenient histological means known in the art, e.g. light, bright field, dark field, phase contrast, fluorescence, reflection, interference, confocal microscopy or a combination thereof. Typically the tissue sample is stained prior to visualization to provide contrast between the different regions, e.g. cells, of the tissue sample. The type of stain used will be dependent on the type of tissue and the region of the cells to be stained. Such staining protocols are known in the art. In some embodiments more than one stain may be used to visualize (image) different aspects of the tissue sample, e.g. different regions of the tissue sample, specific cell structures (e.g. organelles) or different cell types. In other embodiments, the tissue sample may be visualized or imaged without staining the sample, e.g. if the tissue sample already contains pigments that provide sufficient contrast or if particular forms of microscopy are used.

In a preferred embodiment, the tissue sample is visualized or imaged using fluorescence microscopy.

The tissue sample, i.e. any residual tissue that remains in contact with the array substrate following the reverse transcription step and optionally imaging, if imaging is desired and was not carried out before reverse transcription, preferably is removed prior to the step of releasing the cDNA molecules from the array. Thus, the methods of the invention may comprise a step of washing the array. Removal of the residual tissue sample may be performed using any suitable means and will be dependent on the tissue sample. In the simplest embodiment, the array may be washed with water. The water may contain various additives, e.g. surfactants (e.g. detergents), enzymes etc to facilitate removal of the tissue. In some embodiments, the array is washed with a solution comprising a proteinase enzyme (and suitable buffer) e.g. proteinase K. In other embodiments, the solution may comprise also or alternatively cellulase, hemicellulase or chitinase enzymes, e.g. if the tissue sample is from a plant or fungal source. In further embodiments, the temperature of the solution used to wash the array may be, e.g. at least 30° C., preferably at least 35, 40, 45, 50 or 55° C. It will be evident that the wash solution should minimize the disruption of the immobilized nucleic acid molecules. For instance, in some embodiments the nucleic acid molecules may be immobilized on the substrate of the array indirectly, e.g. via hybridization of the capture probe and the RNA and/or the capture probe and the surface probe, thus the wash step should not interfere with the interaction between the molecules immobilized on the array, i.e. should not cause the nucleic acid molecules to be denatured.

Following the step of contacting the array with a tissue sample, under conditions sufficient to allow hybridization to occur between the nucleic acid, e.g. RNA (preferably mRNA), of the tissue sample and the capture probes, the step of securing (acquiring) the hybridized nucleic acid takes place. Securing or acquiring the captured nucleic acid involves a covalent attachment of a complementary strand of the hybridized nucleic acid to the capture probe (i.e. via a nucleotide bond, a phosphodiester bond between juxtaposed 3′-hydroxyl and 5′-phosphate termini of two immediately adjacent nucleotides), thereby tagging or marking the captured nucleic acid with the positional domain specific to the feature on which the nucleic acid is captured.

In some embodiments, securing the hybridized nucleic acid, e.g. a single stranded nucleic acid, may involve extending the capture probe to produce a copy of the captured nucleic acid, e.g. generating cDNA from the captured (hybridized) RNA. It will be understood that this refers to the synthesis of a complementary strand of the hybridized nucleic acid, e.g. generating cDNA based on the captured RNA template (the RNA hybridized to the capture domain of the capture probe). Thus, in an initial step of extending the capture probe, e.g. the cDNA generation, the captured (hybridized) nucleic acid, e.g. RNA acts as a template for the extension, e.g. reverse transcription, step. In other embodiments, as described below, securing the hybridized nucleic acid, e.g. partially double stranded DNA, may involve covalently coupling the hybridized nucleic acid, e.g. fragmented DNA, to the capture probe, e.g. ligating to the capture probe the complementary strand of the nucleic acid hybridized to the capture probe, in a ligation reaction.

Reverse transcription concerns the step of synthesizing cDNA (complementary or copy DNA) from RNA, preferably mRNA (messenger RNA), by reverse transcriptase. Thus cDNA can be considered to be a copy of the RNA present in a cell at the time at which the tissue sample was taken, i.e. it represents all or some of the genes that were expressed in said cell at the time of isolation.

The capture probe, specifically the capture domain of the capture probe, acts as a primer for producing the complementary strand of the nucleic acid hybridized to the capture probe, e.g. a primer for reverse transcription. Hence, the nucleic acid, e.g. cDNA, molecules generated by the extension reaction, e.g. reverse transcription reaction, incorporate the sequence of the capture probe, i.e. the extension reaction, e.g. reverse transcription reaction, may be seen as a way of labelling indirectly the nucleic acid, e.g. transcripts, of the tissue sample that are in contact with each feature of the array. As mentioned above, each species of capture probe comprises a positional domain (feature identification tag) that represents a unique sequence for each feature of the array. Thus, all of the nucleic acid, e.g. cDNA, molecules synthesized at a specific feature will comprise the same nucleic acid “tag”.
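Because every molecule synthesized at a feature carries that feature's positional domain, sequences may be sorted back to array positions by reading the tag. The following Python sketch illustrates the idea only: the fixed offset and length of the tag within each sequence, the tag sequences and the coordinates are all hypothetical, and exact matching is used for simplicity.

```python
from collections import defaultdict


def assign_to_features(molecules, tag_to_position, tag_start, tag_length):
    """Group sequences by the feature whose positional tag they carry.

    Each sequence is assumed to contain its tag at a fixed offset; the
    part 3' of the tag is kept as the payload. Sequences carrying an
    unknown tag are discarded.
    """
    by_feature = defaultdict(list)
    for seq in molecules:
        tag = seq[tag_start:tag_start + tag_length]
        position = tag_to_position.get(tag)
        if position is not None:
            by_feature[position].append(seq[tag_start + tag_length:])
    return dict(by_feature)


# Hypothetical three-nucleotide tags following a three-nucleotide prefix.
tags = {"CCG": (0, 0), "TTG": (0, 1)}
molecules = ["AAACCGGGTTT", "AAACCGCATCA", "AAATTGACGTA", "AAAGGGACGTA"]
grouped = assign_to_features(molecules, tags, tag_start=3, tag_length=3)
```

In practice positional domains would be longer than in this example, and some tolerance for sequencing errors within the tag may be desirable; neither is attempted in the sketch.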

The nucleic acid, e.g. cDNA, molecules synthesized at each feature of the array may represent the genome of, or genes expressed from, the region or area of the tissue sample in contact with that feature, e.g. a tissue or cell type or group or sub-group thereof, and may further represent genes expressed under specific conditions, e.g. at a particular time, in a specific environment, at a stage of development or in response to stimulus etc. Hence, the cDNA at any single feature may represent the genes expressed in a single cell, or if the feature is in contact with the sample at a cell junction, the cDNA may represent the genes expressed in more than one cell. Similarly, if a single cell is in contact with multiple features, then each feature may represent a proportion of the genes expressed in said cell. Similarly, in embodiments in which the captured nucleic acid is DNA, any single feature may be representative of the genome of a single cell or more than one cell. Alternatively, the genome of a single cell may be represented by multiple features.

The step of extending the capture probe, e.g. reverse transcription, may be performed using any suitable enzymes and protocol of which many exist in the art, as described in detail below. However, it will be evident that it is not necessary to provide a primer for the synthesis of the first nucleic acid, e.g. cDNA, strand because the capture domain of the capture probe acts as the primer, e.g. reverse transcription primer.

Preferably, in the context of the present invention the secured nucleic acid (i.e. the nucleic acid covalently attached to the capture probe), e.g. cDNA is treated to comprise double stranded DNA. However, in some embodiments, the captured DNA may already comprise double stranded DNA, e.g. where partially double stranded fragmented DNA is ligated to the capture probe. Treatment of the captured nucleic acid to produce double stranded DNA may be achieved in a single reaction to generate only a second DNA, e.g. cDNA, strand, i.e. to produce double stranded DNA molecules without increasing the number of double stranded DNA molecules, or in an amplification reaction to generate multiple copies of the second strand, which may be in the form of single stranded DNA (e.g. linear amplification) or double stranded DNA, e.g. cDNA (e.g. exponential amplification).

The step of second strand DNA, e.g. cDNA, synthesis may take place in situ on the array, either as a discrete step of second strand synthesis, for example using random primers as described in more detail below, or in the initial step of an amplification reaction. Alternatively, the first strand DNA, e.g. cDNA (the strand comprising, i.e. incorporating, the capture probe) may be released from the array and second strand synthesis, whether as a discrete step or in an amplification reaction, may occur subsequently, e.g. in a reaction carried out in solution.

Where second strand synthesis takes place on the array (i.e. in situ) the method may include an optional step of removing the captured nucleic acid, e.g. RNA, before the second strand synthesis, for example using an RNA digesting enzyme (RNase), e.g. RNase H. Procedures for this are well known and described in the art. However, this is generally not necessary, and in most cases the RNA degrades naturally. Removal of the tissue sample from the array will generally remove the RNA from the array. RNase H can be used if desired to increase the robustness of RNA removal.

For instance, in tissue samples that comprise large amounts of RNA, the step of generating the double stranded cDNA may yield a sufficient amount of cDNA that it may be sequenced directly (following release from the array). In this case, second strand cDNA synthesis may be achieved by any means known in the art and as described below. The second strand synthesis reaction may be performed on the array directly, i.e. whilst the cDNA is immobilized on the array, or preferably after the cDNA has been released from the array substrate, as described below.

In other embodiments it will be necessary to enhance, i.e. amplify, the amount of secured nucleic acid, e.g. synthesized cDNA, to yield quantities that are sufficient for DNA sequencing. In this embodiment, the first strand of the secured nucleic acid, e.g. cDNA molecules, which comprise also the capture probe of the features of the array, acts as a template for the amplification reaction, e.g. a polymerase chain reaction. The first reaction product of the amplification will be a second strand of DNA, e.g. cDNA, which itself will act as a template for further cycles of the amplification reaction.

In either of the above described embodiments, the second strand of DNA, e.g. cDNA, will comprise a complement of the capture probe. If the capture probe comprises a universal domain, and particularly an amplification domain within the universal domain, then this may be used for the subsequent amplification of the DNA, e.g. cDNA, e.g. the amplification reaction may comprise a primer with the same sequence as the amplification domain, i.e. a primer that is complementary (i.e. hybridizes) to the complement of the amplification domain. In view of the fact that the amplification domain is upstream of the positional domain of the capture probe (in the secured nucleic acid, e.g. the first cDNA strand), the complement of the positional domain will be incorporated in the second strand of the DNA, e.g. cDNA molecules.
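The relationship between the two strands can be illustrated with a minimal sketch. All sequences below are hypothetical placeholders chosen for illustration; they are not sequences of the invention, and the domain lengths are arbitrary.

```python
# Hypothetical sequences for illustration only; none come from the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# First strand written 5'->3': amplification domain, positional domain,
# capture domain (poly-T here), then the cDNA extended from the capture probe.
amplification_domain = "ACACTCTTTCCC"
positional_domain = "GATTACAGGCT"
capture_domain = "TTTTTTTTTT"
cdna_extension = "GCATCGTAGC"
first_strand = (
    amplification_domain + positional_domain + capture_domain + cdna_extension
)

# A full-length second strand is the reverse complement of the first strand.
second_strand = reverse_complement(first_strand)

# The second strand carries the complement of the positional domain, and the
# complement of the amplification domain lies at its 3' end, where a primer
# having the same sequence as the amplification domain can hybridize.
assert reverse_complement(positional_domain) in second_strand
assert second_strand.endswith(reverse_complement(amplification_domain))
```

Because the amplification domain lies 5′ of the positional domain in the first strand, any second strand extended through to the complement of the amplification domain necessarily spans the complement of the positional domain.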

In embodiments where the second strand of DNA, e.g. cDNA, is generated in a single reaction, the second strand synthesis may be achieved by any suitable means. For instance, the first strand cDNA, preferably, but not necessarily, released from the array substrate, may be incubated with random primers, e.g. hexamer primers, and a DNA polymerase, preferably a strand displacement polymerase, e.g. Klenow (exo-), under conditions sufficient for templated DNA synthesis to occur. This process will yield double stranded cDNA molecules of varying lengths and is unlikely to yield full-length cDNA molecules, i.e. cDNA molecules that correspond to the entire mRNA from which they were synthesized. The random primers will hybridise to the first strand cDNA molecules at a random position, i.e. within the sequence rather than at the end of the sequence.

If it is desirable to generate full-length DNA, e.g. cDNA, molecules, i.e. molecules that correspond to the whole of the captured nucleic acid, e.g. RNA molecule (if the nucleic acid, e.g. RNA, was partially degraded in the tissue sample then the captured nucleic acid, e.g. RNA, molecules will not be “full-length” transcripts or the same length as the initial fragments of genomic DNA), then the 3′ end of the secured nucleic acid, e.g. first strand cDNA, molecules may be modified.

For example, a linker or adaptor may be ligated to the 3′ end of the cDNA molecules. This may be achieved using single stranded ligation enzymes such as T4 RNA ligase or Circligase™ (Epicentre Biotechnologies).

Alternatively, a helper probe (a partially double stranded DNA molecule capable of hybridising to the 3′ end of the first strand cDNA molecule) may be ligated to the 3′ end of the secured nucleic acid, e.g. first strand cDNA, molecule using a double stranded ligation enzyme such as T4 DNA ligase. Other enzymes appropriate for the ligation step are known in the art and include, e.g. Tth DNA ligase, Taq DNA ligase, Thermococcus sp. (strain 9° N) DNA ligase (9° N™ DNA ligase, New England Biolabs), and Ampligase™ (Epicentre Biotechnologies). The helper probe comprises also a specific sequence from which the second strand DNA, e.g. cDNA, synthesis may be primed using a primer that is complementary to the part of the helper probe that is ligated to the secured nucleic acid, e.g. first cDNA strand. A further alternative comprises the use of a terminal transferase active enzyme to incorporate a polynucleotide tail, e.g. a poly-A tail, at the 3′ end of the secured nucleic acid, e.g. first strand of cDNA, molecules. The second strand synthesis may be primed using a poly-T primer, which may also comprise a specific amplification domain for further amplification. Other methods for generating “full-length” double stranded DNA, e.g. cDNA, molecules (or maximal length second strand synthesis) are well-established in the art.

In some embodiments, second strand synthesis may use a method of template switching, e.g. using the SMART™ technology from Clontech®. SMART (Switching Mechanism at 5′ End of RNA Template) technology is well established in the art and is based on the discovery that reverse transcriptase enzymes, e.g. Superscript® II (Invitrogen), are capable of adding a few nucleotides at the 3′ end of an extended cDNA molecule, i.e. to produce a DNA/RNA hybrid with a single stranded DNA overhang at the 3′ end. The DNA overhang may provide a target sequence to which an oligonucleotide probe can hybridise to provide an additional template for further extension of the cDNA molecule. Advantageously, the oligonucleotide probe that hybridises to the cDNA overhang contains an amplification domain sequence, the complement of which is incorporated into the synthesised first strand cDNA product. Primers containing the amplification domain sequence, which will hybridise to the complementary amplification domain sequence incorporated into the cDNA first strand, can be added to the reaction mix to prime second strand synthesis using a suitable polymerase enzyme and the cDNA first strand as a template. This method avoids the need to ligate adaptors to the 3′ end of the cDNA first strand. Whilst template switching was originally developed for full-length mRNAs, which have a 5′ cap structure, it has since been demonstrated to work equally well with truncated mRNAs without the cap structure. Thus, template switching may be used in the methods of the invention to generate full length and/or partial or truncated cDNA molecules. Thus, in a preferred embodiment of the invention, the second strand synthesis may utilise, or be achieved by, template switching. In a particularly preferred embodiment, the template switching reaction, i.e. the further extension of the cDNA first strand to incorporate the complementary amplification domain, is performed in situ (whilst the capture probe is still attached, directly or indirectly, to the array). Preferably, the second strand synthesis reaction is also performed in situ.

In embodiments where it may be necessary or advantageous to enhance, enrich or amplify the DNA, e.g. cDNA molecules, amplification domains may be incorporated in the DNA, e.g. cDNA molecules. As discussed above, a first amplification domain may be incorporated into the secured nucleic acid molecules, e.g. the first strand of the cDNA molecules, when the capture probe comprises a universal domain comprising an amplification domain. In these embodiments, the second strand synthesis may incorporate a second amplification domain. For example, the primers used to generate the second strand cDNA, e.g. random hexamer primers, poly-T primer, the primer that is complementary to the helper probe, may comprise at their 5′ end an amplification domain, i.e. a nucleotide sequence to which an amplification primer may hybridize. Thus, the resultant double stranded DNA may comprise an amplification domain at or towards each 5′ end of the double stranded DNA, e.g. cDNA molecules. These amplification domains may be used as targets for primers used in an amplification reaction, e.g. PCR. Alternatively, the linker or adaptor which is ligated to the 3′ end of the secured nucleic acid molecules, e.g. first strand cDNA molecules, may comprise a second universal domain comprising a second amplification domain. Similarly, a second amplification domain may be incorporated into the first strand cDNA molecules by template switching.

In embodiments where the capture probe does not comprise a universal domain, particularly comprising an amplification domain, the second strand of the cDNA molecules may be synthesised in accordance with the above description. The resultant double stranded DNA molecules may be modified to incorporate an amplification domain at the 5′ end of the first DNA, e.g. cDNA strand (a first amplification domain) and, if not incorporated in the second strand DNA, e.g. cDNA synthesis step, at the 5′ end of the second DNA, e.g. cDNA strand (a second amplification domain). Such amplification domains may be incorporated, e.g. by ligating double stranded adaptors to the ends of the DNA, e.g. cDNA molecules. Enzymes appropriate for the ligation step are known in the art and include, e.g. Tth DNA ligase, Taq DNA ligase, Thermococcus sp. (strain 9° N) DNA ligase (9° N™ DNA ligase, New England Biolabs), Ampligase™ (Epicentre Biotechnologies) and T4 DNA ligase. In a preferred embodiment the first and second amplification domains comprise different sequences.

From the above, it is therefore apparent that universal domains, which may comprise an amplification domain, may be added to the secured (i.e. extended or ligated) DNA molecules, for example to the cDNA molecules, or their complements (e.g. second strand) by various methods and techniques and combinations of such techniques known in the art, e.g. by use of primers which include such a domain, ligation of adaptors, use of terminal transferase enzymes and/or by template switching methods. As is clear from the discussion herein, such domains may be added before or after release of the DNA molecules from the array.

It will be apparent from the above description that the DNA, e.g. cDNA molecules from a single array that have been synthesized by the methods of the invention may all comprise the same first and second amplification domains.

Consequently, a single amplification reaction, e.g. PCR, may be sufficient to amplify all of the DNA, e.g. cDNA molecules. Thus in a preferred embodiment, the method of the invention may comprise a step of amplifying the DNA, e.g. cDNA molecules. In one embodiment the amplification step is performed after the release of the DNA, e.g. cDNA molecules from the substrate of the array. In other embodiments amplification may be performed on the array (i.e. in situ on the array). It is known in the art that amplification reactions may be carried out on arrays and on-chip thermocyclers exist for carrying out such reactions. Thus, in one embodiment arrays which are known in the art as sequencing platforms or for use in any form of sequence analysis (e.g. in or by next generation sequencing technologies) may be used as the basis of the arrays of the present invention (e.g. Illumina bead arrays etc.).

For the synthesis of the second strand of DNA, e.g. cDNA, it is preferable to use a strand displacement polymerase (e.g. Φ29 DNA polymerase, Bst (exo-) DNA polymerase, Klenow (exo-) DNA polymerase) if the cDNA released from the substrate of the array comprises a partially double stranded nucleic acid molecule. For instance, the released nucleic acids will be at least partially double stranded (e.g. DNA:DNA, DNA:RNA or DNA:DNA/RNA hybrid) in embodiments where the capture probe is immobilized indirectly on the substrate of the array via a surface probe and the step of releasing the DNA, e.g. cDNA molecules comprises a cleavage step. The strand displacement polymerase is necessary to ensure that the second cDNA strand synthesis incorporates the complement of the positional domain (feature identification domain) into the second DNA, e.g. cDNA strand.

It will be evident that the step of releasing at least part of the DNA, e.g. cDNA molecules or their amplicons from the surface or substrate of the array may be achieved using a number of methods. The primary aim of the release step is to yield molecules into which the positional domain of the capture probe (or its complement) is incorporated (or included), such that the DNA, e.g. cDNA molecules or their amplicons are “tagged” according to their feature (or position) on the array.

The release step thus removes DNA, e.g. cDNA molecules or amplicons thereof from the array, which DNA, e.g. cDNA molecules or amplicons include the positional domain or its complement (by virtue of it having been incorporated into the secured nucleic acid, e.g. the first strand cDNA by, e.g. extension of the capture probe, and optionally copied in the second strand DNA if second strand synthesis takes place on the array, or copied into amplicons if amplification takes place on the array). Hence, in order to yield sequence analysis data that can be correlated with the various regions in the tissue sample it is essential that the released molecules comprise the positional domain of the capture probe (or its complement).

Since the released molecule may be a first and/or second strand DNA, e.g. cDNA molecule or amplicon, and since the capture probe may be immobilised indirectly on the array, it will be understood that whilst the release step may comprise a step of cleaving a DNA, e.g. cDNA molecule from the array, the release step does not require a step of nucleic acid cleavage; a DNA, e.g. cDNA molecule or an amplicon may simply be released by denaturing a double-stranded molecule, for example releasing the second cDNA strand from the first cDNA strand, or releasing an amplicon from its template or releasing the first strand cDNA molecule (i.e. the extended capture probe) from a surface probe. Accordingly, a DNA, e.g. cDNA molecule may be released from the array by nucleic acid cleavage and/or by denaturation (e.g. by heating to denature a double-stranded molecule). Where amplification is carried out in situ on the array, this will of course encompass releasing amplicons by denaturation in the cycling reaction.

In some embodiments, the DNA, e.g. cDNA molecules are released by enzymatic cleavage of a cleavage domain, which may be located in the universal domain or positional domain of the capture probe. As mentioned above, the cleavage domain must be located upstream (at the 5′ end) of the positional domain, such that the released DNA, e.g. cDNA molecules comprise the positional (identification) domain. Suitable enzymes for nucleic acid cleavage include restriction endonucleases, e.g. RsaI. Other enzymes, e.g. a mixture of Uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase Endonuclease VIII (USER™ enzyme) or a combination of the MutY and T7 endonuclease I enzymes, are used in preferred embodiments of the methods of the invention.
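The positional requirement on the cleavage domain can be illustrated with a minimal sketch. It assumes the well-known RsaI recognition site GTAC (blunt cut between T and A), treats the molecule as a plain 5′ to 3′ string, and uses hypothetical domain sequences; the function name is likewise illustrative, and only the first site in the molecule is considered.

```python
RSAI_SITE = "GTAC"  # RsaI recognises GTAC and cuts between T and A (blunt end)


def release_by_rsai(extended_probe: str) -> str:
    """Return the fragment 3' of the first RsaI cut in a 5'->3' sequence.

    The substrate end of the probe is taken to be the 5' end, so the 3'
    fragment is the part that leaves the array after cleavage.
    """
    idx = extended_probe.find(RSAI_SITE)
    if idx == -1:
        raise ValueError("no RsaI site found")
    return extended_probe[idx + 2:]


# Hypothetical domains for illustration only.
positional_domain = "GATTACAGGCT"
capture_and_cdna = "TTTTTTTTGCATCG"

# Cleavage domain upstream (5') of the positional domain: the tag is released.
good_probe = "AA" + RSAI_SITE + "CC" + positional_domain + capture_and_cdna
assert positional_domain in release_by_rsai(good_probe)

# Cleavage domain downstream (3') of the positional domain: the tag is lost.
bad_probe = positional_domain + RSAI_SITE + capture_and_cdna
assert positional_domain not in release_by_rsai(bad_probe)
```

In practice the positional domain and the captured sequence may themselves contain the recognition site, which this sketch does not model.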

In an alternative embodiment, the DNA, e.g. cDNA molecules may be released from the surface or substrate of the array by physical means. For instance, in embodiments where the capture probe is indirectly immobilized on the substrate of the array, e.g. via hybridization to the surface probe, it may be sufficient to disrupt the interaction between the nucleic acid molecules. Methods for disrupting the interaction between nucleic acid molecules, e.g. denaturing double stranded nucleic acid molecules, are well known in the art. A straightforward method for releasing the DNA, e.g. cDNA molecules (i.e. of stripping the array of the synthesized DNA, e.g. cDNA molecules) is to use a solution that interferes with the hydrogen bonds of the double stranded molecules. In a preferred embodiment of the invention, the DNA, e.g. cDNA molecules may be released by applying heated water, e.g. water or buffer of at least 85° C., preferably at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99° C. As an alternative or addition to the use of a temperature sufficient to disrupt the hydrogen bonding, the solution may comprise salts, surfactants etc. that may further destabilize the interaction between the nucleic acid molecules, resulting in the release of the DNA, e.g. cDNA molecules.

It will be understood that the application of a high temperature solution, e.g. 90-99° C. water, may be sufficient to disrupt a covalent bond used to immobilize the capture probe or surface probe to the array substrate. Hence, in a preferred embodiment, the DNA, e.g. cDNA molecules may be released by applying hot water to the array to disrupt covalently immobilized capture or surface probes.

It is implicit that the released DNA, e.g. cDNA molecules (the solution comprising the released DNA, e.g. cDNA molecules) are collected for further manipulation, e.g. second strand synthesis and/or amplification. Nevertheless, the method of the invention may be seen to comprise a step of collecting or recovering the released DNA, e.g. cDNA molecules. As noted above, in the context of in situ amplification the released molecules may include amplicons of the secured nucleic acid, e.g. cDNA.

In embodiments of methods of the invention, it may be desirable to remove any unextended or unligated capture probes. This may be, for example, after the step of releasing DNA molecules from the array. Any desired or convenient method may be used for such removal including, for example, use of an enzyme to degrade the unextended or unligated probes, e.g. exonuclease.

The DNA, e.g. cDNA molecules, or amplicons, that have been released from the array, which may have been modified as discussed above, are analysed to investigate (e.g. determine) their sequence, although as noted above actual sequence determination is not required; any method of analysing the sequence may be used. Thus, any method of nucleic acid analysis may be used. The step of sequence analysis may identify the positional domain and hence allow the analysed molecule to be localised to a position in the tissue sample. Similarly, the nature or identity of the analysed molecule may be determined. In this way the nucleic acid, e.g. RNA, at a given position in the array, and hence in the tissue sample, may be determined. Hence the analysis step may include or use any method which identifies the analysed molecule (and hence the “target” molecule) and its positional domain. Generally such a method will be a sequence-specific method. For example, the method may use sequence-specific primers or probes, particularly primers or probes specific for the positional domain and/or for a specific nucleic acid molecule to be detected or analysed, e.g. a DNA molecule corresponding to a nucleic acid, e.g. RNA or cDNA molecule to be detected. Typically in such a method sequence-specific amplification primers, e.g. PCR primers, may be used.

In some embodiments it may be desirable to analyse a subset or family of target related molecules, e.g. all of the sequences that encode a particular group of proteins which share sequence similarity and/or conserved domains, e.g. a family of receptors. Hence, the amplification and/or analysis methods described herein may use degenerate or gene family specific primers or probes that hybridise to a subset of the captured nucleic acids or nucleic acids derived therefrom, e.g. amplicons. In a particularly preferred embodiment, the amplification and/or analysis methods may utilise a universal primer (i.e. a primer common to all of the captured sequences) in combination with a degenerate or gene family specific primer specific for a subset of target molecules.
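By way of illustration, a degenerate primer can be regarded as shorthand for a set of concrete primer sequences. The following minimal sketch expands a primer written with the standard IUPAC nucleotide codes and tests whether any member of the set occurs in a template; the function names and example sequences are hypothetical, and exact string matching stands in for hybridisation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def primer_matches(primer: str, template: str) -> bool:
    """True if any expansion of the primer occurs exactly in the template."""
    return any(seq in template for seq in expand_degenerate(primer))


# One degenerate primer covering two hypothetical family members that
# differ at a single position (C or T).
assert expand_degenerate("AYG") == ["ACG", "ATG"]
assert primer_matches("GGAYTC", "TTGGACTCAA")
assert primer_matches("GGAYTC", "TTGGATTCAA")
assert not primer_matches("GGAYTC", "TTGGAATCAA")
```

The number of concrete sequences grows multiplicatively with each degenerate position, which is one reason such primers are typically kept short in their degenerate region.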

Thus in one embodiment, amplification-based, especially PCR-based, methods of sequence analysis are used.

However, the steps of modifying and/or amplifying the released DNA, e.g. cDNA molecules may introduce additional components into the sample, e.g. enzymes, primers, nucleotides etc. Hence, the methods of the invention may further comprise a step of purifying the sample comprising the released DNA, e.g. cDNA molecules or amplicons prior to the sequence analysis, e.g. to remove oligonucleotide primers, nucleotides, salts etc. that may interfere with the sequencing reactions. Any suitable method of purifying the DNA, e.g. cDNA molecules may be used.

As noted above, sequence analysis of the released DNA molecules may be direct or indirect. Thus the sequence analysis substrate (which may be viewed as the molecule which is subjected to the sequence analysis step or process) may directly be the molecule which is released from the array or it may be a molecule which is derived therefrom. Thus, for example in the context of a sequence analysis step which involves a sequencing reaction, the sequencing template may be the molecule which is released from the array or it may be a molecule derived therefrom. For example, a first and/or second strand DNA, e.g. cDNA molecule released from the array may be directly subjected to sequence analysis (e.g. sequencing), i.e. may directly take part in the sequence analysis reaction or process (e.g. the sequencing reaction or sequencing process, or be the molecule which is sequenced or otherwise identified). In the context of in situ amplification the released molecule may be an amplicon. Alternatively, the released molecule may be subjected to a step of second strand synthesis or amplification before sequence analysis (e.g. sequencing or identification by other means). The sequence analysis substrate (e.g. template) may thus be an amplicon or a second strand of a molecule which is directly released from the array.

Both strands of a double stranded molecule may be subjected to sequence analysis (e.g. sequenced) but the invention is not limited to this and single stranded molecules (e.g. cDNA) may be analysed (e.g. sequenced). For example various sequencing technologies may be used for single molecule sequencing, e.g. the Helicos or Pacbio technologies, or nanopore sequencing technologies which are being developed. Thus, in one embodiment the first strand of DNA, e.g. cDNA may be subjected to sequencing. The first strand DNA, e.g. cDNA may need to be modified at the 3′ end to enable single molecule sequencing. This may be done by procedures analogous to those for handling the second DNA, e.g. cDNA strand. Such procedures are known in the art.

In a preferred aspect of the invention the sequence analysis will identify or reveal a portion of captured nucleic acid, e.g. RNA sequence and the sequence of the positional domain. The sequence of the positional domain (or tag) will identify the feature to which the nucleic acid, e.g. mRNA molecule was captured. The sequence of the captured nucleic acid, e.g. RNA molecule may be compared with a sequence database of the organism from which the sample originated to determine the gene to which it corresponds. By determining which region (e.g. cell) of the tissue sample was in contact with the feature, it is possible to determine which region of the tissue sample was expressing said gene (or contained the gene, e.g. in the case of spatial genomics). This analysis may be achieved for all of the DNA, e.g. cDNA molecules generated by the methods of the invention, yielding a spatial transcriptome or genome of the tissue sample.
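The correlation of positional domain, feature position and gene identity can be sketched as a simple tabulation. The sketch below assumes that each analysed molecule has already been reduced to a pair of (positional tag, gene identifier) and that a lookup from tag to array coordinates is available; the tags, coordinates, gene names and function name are all hypothetical.

```python
from collections import Counter, defaultdict


def spatial_counts(reads, tag_to_xy):
    """Tabulate gene counts per array feature.

    reads     -- iterable of (positional_tag, gene_id) pairs
    tag_to_xy -- dict mapping each positional tag to its (x, y) feature position
    Returns a dict {(x, y): Counter({gene_id: count})}.
    """
    counts = defaultdict(Counter)
    for tag, gene in reads:
        xy = tag_to_xy.get(tag)
        if xy is None:  # tag not present on the array, e.g. a sequencing error
            continue
        counts[xy][gene] += 1
    return dict(counts)


# Hypothetical two-feature array and four analysed molecules.
tag_to_xy = {"GATTACA": (0, 0), "CCTGAAT": (0, 1)}
reads = [
    ("GATTACA", "geneA"),
    ("GATTACA", "geneA"),
    ("CCTGAAT", "geneB"),
    ("TTTTTTT", "geneA"),  # unknown tag, ignored
]
table = spatial_counts(reads, tag_to_xy)
assert table[(0, 0)]["geneA"] == 2
assert table[(0, 1)]["geneB"] == 1
```

Overlaying such a table on an image of the tissue sample, aligned to the array, would then indicate which region of the sample gave rise to each gene count.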

By way of a representative example, sequencing data may be analysed to sort the sequences into specific species of capture probe, i.e. according to the sequence of the positional domain. This may be achieved by, e.g. using the FastX toolkit FASTQ Barcode splitter tool to sort the sequences into individual files for the respective capture probe positional domain (tag) sequences. The sequences of each species, i.e. from each feature, may be analyzed to determine the identity of the transcripts. For instance, the sequences may be identified using e.g. Blastn software, to compare the sequences to one or more genome databases, preferably the database for the organism from which the tissue sample was obtained. The identity of the database sequence with the greatest similarity to the sequence generated by the methods of the invention will be assigned to said sequence. In general, only hits with a certainty of at least 1e⁻⁴, preferably 1e⁻⁷, 1e⁻⁸, or 1e⁻⁹ will be considered to have been successfully identified.
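The sorting and filtering steps above can be sketched in a few lines. This is not the FastX or Blastn implementation; it is a simplified stand-in which assumes that the positional tag occupies a fixed number of bases at the start of each read and that database hits are supplied as (subject identifier, E-value) pairs. The function names, tags and hits are hypothetical, and the lowest E-value is used as a proxy for greatest similarity.

```python
from collections import defaultdict


def split_by_tag(reads, tags, tag_len):
    """Group reads by their leading positional tag, returning the inserts.

    Reads whose first tag_len bases are not a known tag are discarded.
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        if tag in tags:
            bins[tag].append(insert)
    return dict(bins)


def best_hit(hits, max_evalue=1e-7):
    """Return the subject with the lowest E-value at or below the threshold.

    hits -- list of (subject_id, evalue); returns None if nothing passes.
    """
    passing = [hit for hit in hits if hit[1] <= max_evalue]
    return min(passing, key=lambda hit: hit[1])[0] if passing else None


# Hypothetical reads: 7-base tag followed by transcript-derived sequence.
tags = {"GATTACA", "CCTGAAT"}
reads = ["GATTACAGGCATC", "CCTGAATTTGACC", "AAAAAAAGGCATC"]
bins = split_by_tag(reads, tags, tag_len=7)
assert bins == {"GATTACA": ["GGCATC"], "CCTGAAT": ["TTGACC"]}

# Hypothetical hits for one read: only geneA passes the threshold.
assert best_hit([("geneA", 3e-12), ("geneB", 2e-3)]) == "geneA"
assert best_hit([("geneB", 2e-3)]) is None
```

A stricter or looser threshold can be selected through the max_evalue parameter.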

It will be apparent that any nucleic acid sequencing method may be utilised in the methods of the invention. However, the so-called “next generation sequencing” techniques will find particular utility in the present invention. High-throughput sequencing is particularly useful in the methods of the invention because it enables a large number of nucleic acids to be partially sequenced in a very short period of time. In view of the recent explosion in the number of fully or partially sequenced genomes, it is not essential to sequence the full length of the generated DNA, e.g. cDNA molecules to determine the gene to which each molecule corresponds. For example, the first 100 nucleotides from each end of the DNA, e.g. cDNA molecules should be sufficient to identify both the feature to which the nucleic acid, e.g. mRNA was captured (i.e. its location on the array) and the gene expressed. The sequence reaction from the “capture probe end” of the DNA, e.g. cDNA molecules yields the sequence of the positional domain and at least about 20 bases, preferably 30 or 40 bases of transcript specific sequence data. The sequence reaction from the “non-capture probe end” may yield at least about 70 bases, preferably 80, 90, or 100 bases of transcript specific sequence data.

As a representative example, the sequencing reaction may be based on reversible dye-terminators, such as used in the Illumina™ technology. For example, DNA molecules are first attached to primers on, e.g. a glass or silicon slide and amplified so that local clonal colonies are formed (bridge amplification). Four types of fluorescently labelled reversible terminator nucleotides are added, and non-incorporated nucleotides are washed away. Unlike pyrosequencing, the DNA can only be extended one nucleotide at a time. A camera takes images of the fluorescently labelled nucleotides, and then the dye along with the terminal 3′ blocker is chemically removed from the DNA, allowing a next cycle. This may be repeated until the required sequence data is obtained. Using this technology, thousands of nucleic acids may be sequenced simultaneously on a single slide.

Other high-throughput sequencing techniques may be equally suitable for the methods of the invention, e.g. pyrosequencing. In this method the DNA is amplified inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead that then forms a clonal colony. The sequencing machine contains many picolitre-volume wells each containing a single bead and sequencing enzymes. Pyrosequencing uses luciferase to generate light for detection of the individual nucleotides added to the nascent DNA and the combined data are used to generate sequence read-outs.

An example of a technology in development is based on the detection of hydrogen ions that are released during the polymerisation of DNA. A microwell containing a template DNA strand to be sequenced is flooded with a single type of nucleotide. If the introduced nucleotide is complementary to the leading template nucleotide it is incorporated into the growing complementary strand. This causes the release of a hydrogen ion that triggers a hypersensitive ion sensor, which indicates that a reaction has occurred. If homopolymer repeats are present in the template sequence multiple nucleotides will be incorporated in a single cycle. This leads to a corresponding number of released hydrogen ions and a proportionally higher electronic signal.
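The flow-by-flow logic described above, including the proportional signal for homopolymer repeats, can be illustrated with an idealised simulation. The sketch assumes noise-free signals, a template written in the order in which the polymerase reads it, and a repeating flow order of T, A, C, G; the flow order, function names and template are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def flow_signals(template, flow_order="TACG", n_flows=8):
    """Simulate ideal per-flow signals for a template strand.

    template -- template bases in the order they are read by the polymerase
    Returns a list of (nucleotide, incorporations) for each flow; the
    incorporation count models the number of hydrogen ions released.
    """
    pos = 0
    signals = []
    for i in range(n_flows):
        nuc = flow_order[i % len(flow_order)]
        n = 0
        # Incorporate while the flowed nucleotide pairs with the template,
        # so a homopolymer run is consumed within a single flow.
        while pos < len(template) and COMPLEMENT[template[pos]] == nuc:
            n += 1
            pos += 1
        signals.append((nuc, n))
    return signals


def call_bases(signals):
    """Reconstruct the synthesised strand from ideal flow signals."""
    return "".join(nuc * n for nuc, n in signals)


signals = flow_signals("AACG")
# The AA homopolymer gives a double signal in the first T flow.
assert signals[0] == ("T", 2)
assert call_bases(signals) == "TTGC"
```

With real, noisy signals the number of incorporations per flow has to be estimated from the signal amplitude, which becomes harder for long homopolymer runs.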

Thus, it is clear that future sequencing formats are slowly being made available, and with shorter run times as one of the main features of those platforms it will be evident that other sequencing technologies will be useful in the methods of the invention.

An essential feature of the present invention, as described above, is a step of securing a complementary strand of the captured nucleic acid molecules to the capture probe, e.g. reverse transcribing the captured RNA molecules. The reverse transcription reaction is well known in the art and in representative reverse transcription reactions, the reaction mixture includes a reverse transcriptase, dNTPs and a suitable buffer. The reaction mixture may comprise other components, e.g. RNase inhibitor(s). The primers and template, i.e. the capture domain of the capture probe and the captured RNA molecules, are described above.

In the subject methods, each dNTP will typically be present in an amount ranging from about 10 to 5000 μM, usually from about 20 to 1000 μM. It will be evident that an equivalent reaction may be performed to generate a complementary strand of a captured DNA molecule, using an enzyme with DNA polymerase activity. Reactions of this type are well known in the art and are described in more detail below.

The desired reverse transcriptase activity may be provided by one or more distinct enzymes, wherein suitable examples are: M-MLV, MuLV, AMV, HIV, ArrayScript™, MultiScribe™, ThermoScript™, and SuperScript® I, II, and III enzymes.

The reverse transcriptase reaction may be carried out at any suitable temperature, which will be dependent on the properties of the enzyme. Typically, reverse transcriptase reactions are performed between 37 and 55° C., although temperatures outside of this range may also be appropriate. The reaction time may be as little as 1, 2, 3, 4 or 5 minutes or as much as 48 hours. Typically the reaction will be carried out for between 5-120 minutes, preferably 5-60, 5-45 or 5-30 minutes or 1-10 or 1-5 minutes according to choice. The reaction time is not critical and any desired reaction time may be used.

As indicated above, certain embodiments of the methods include an amplification step, where the copy number of generated DNA, e.g. cDNA molecules is increased, e.g., in order to enrich the sample to obtain a better representation of the nucleic acids, e.g. transcripts captured from the tissue sample. The amplification may be linear or exponential, as desired, where representative amplification protocols of interest include, but are not limited to: polymerase chain reaction (PCR); isothermal amplification, etc.
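The difference between linear and exponential amplification can be made concrete with an idealised copy-number calculation. The sketch assumes a fixed per-cycle efficiency between 0 and 1 and ignores reagent depletion and plateau effects; the function names and parameter values are illustrative only.

```python
def linear_copies(n0, cycles, efficiency=1.0):
    """Copies after linear amplification with a single primer.

    Only the original templates are copied in each cycle, so the total
    grows by n0 * efficiency per cycle.
    """
    return n0 * (1 + efficiency * cycles)


def exponential_copies(n0, cycles, efficiency=1.0):
    """Copies after exponential amplification with a primer pair.

    Products of each cycle serve as templates in the next, so the total
    is multiplied by (1 + efficiency) per cycle.
    """
    return n0 * (1 + efficiency) ** cycles


# Starting from 100 template molecules, 10 ideal cycles:
assert linear_copies(100, 10) == 1100
assert exponential_copies(100, 10) == 102400
```

At perfect efficiency, ten cycles therefore give an 11-fold increase in the linear case and a 1024-fold increase in the exponential case.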

The polymerase chain reaction (PCR) is well known in the art, being described in U.S. Pat. Nos. 4,683,202; 4,683,195; 4,800,159; 4,965,188 and 5,512,462, the disclosures of which are herein incorporated by reference. In representative PCR amplification reactions, the reaction mixture includes the above released DNA, e.g. cDNA molecules from the array, which are combined with one or more primers that are employed in the primer extension reaction, e.g., the PCR primers that hybridize to the first and/or second amplification domains (such as forward and reverse primers employed in geometric (or exponential) amplification or a single primer employed in a linear amplification). The oligonucleotide primers with which the released DNA, e.g. cDNA molecules (hereinafter referred to as template DNA for convenience) is contacted will be of sufficient length to provide for hybridization to complementary template DNA under annealing conditions (described in greater detail below). The length of the primers will depend on the length of the amplification domains, but will generally be at least 10 bp in length, usually at least 15 bp in length and more usually at least 16 bp in length and may be as long as 30 bp in length or longer, where the length of the primers will generally range from 18 to 50 bp in length, usually from about 20 to 35 bp in length. The template DNA may be contacted with a single primer or a set of two primers (forward and reverse primers), depending on whether primer extension, linear or exponential amplification of the template DNA is desired.

In addition to the above components, the reaction mixture produced in the subject methods typically includes a polymerase and deoxyribonucleoside triphosphates (dNTPs). The desired polymerase activity may be provided by one or more distinct polymerase enzymes. In many embodiments, the reaction mixture includes at least a Family A polymerase, where representative Family A polymerases of interest include, but are not limited to: Thermus aquaticus polymerases, including the naturally occurring polymerase (Taq) and derivatives and homologues thereof, such as Klentaq (as described in Barnes et al., Proc. Natl. Acad. Sci. USA (1994) 91:2216-2220); Thermus thermophilus polymerases, including the naturally occurring polymerase (Tth) and derivatives and homologues thereof, and the like. In certain embodiments where the amplification reaction that is carried out is a high fidelity reaction, the reaction mixture may further include a polymerase enzyme having 3′-5′ exonuclease activity, e.g., as may be provided by a Family B polymerase, where Family B polymerases of interest include, but are not limited to: Thermococcus litoralis DNA polymerase (Vent) as described in Perler et al., Proc. Natl. Acad. Sci. USA (1992) 89:5577-5581; Pyrococcus species GB-D (Deep Vent); Pyrococcus furiosus DNA polymerase (Pfu) as described in Lundberg et al., Gene (1991) 108:1-6; Pyrococcus woesei (Pwo) and the like. Where the reaction mixture includes both a Family A and Family B polymerase, the Family A polymerase may be present in the reaction mixture in an amount greater than the Family B polymerase, where the difference in activity will usually be at least 10-fold, and more usually at least about 100-fold. Usually the reaction mixture will include four different types of dNTPs corresponding to the four naturally occurring bases present, i.e. dATP, dTTP, dCTP and dGTP. In the subject methods, each dNTP will typically be present in an amount ranging from about 10 to 5000 μM, usually from about 20 to 1000 μM.

The reaction mixtures prepared in the reverse transcriptase and/or amplification steps of the subject methods may further include an aqueous buffer medium that includes a source of monovalent ions, a source of divalent cations and a buffering agent. Any convenient source of monovalent ions, such as KCl, K-acetate, NH₄-acetate, K-glutamate, NH₄Cl, ammonium sulphate, and the like may be employed. The divalent cation may be magnesium, manganese, zinc and the like, where the cation will typically be magnesium. Any convenient source of magnesium cation may be employed, including MgCl₂, Mg-acetate, and the like. The amount of Mg²⁺ present in the buffer may range from 0.5 to 10 mM, but will preferably range from about 3 to 6 mM, and will ideally be at about 5 mM. Representative buffering agents or salts that may be present in the buffer include Tris, Tricine, HEPES, MOPS and the like, where the amount of buffering agent will typically range from about 5 to 150 mM, usually from about 10 to 100 mM, and more usually from about 20 to 50 mM, where in certain preferred embodiments the buffering agent will be present in an amount sufficient to provide a pH ranging from about 6.0 to 9.5, where most preferred is pH 7.3 at 72° C. Other agents which may be present in the buffer medium include chelating agents, such as EDTA, EGTA and the like.
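The concentration ranges quoted above lend themselves to a simple check of a planned buffer recipe. The following is a minimal sketch in Python and is not part of the described method: the dictionary keys (`mg_mM`, `buffer_mM`, `pH`) and the function name are hypothetical, and the outer ranges from the text are treated here as hard limits even though the text qualifies several of them with "about".

```python
# Outer ranges taken from the paragraph above; treated as hard limits
# purely for illustration (the text qualifies several with "about").
BUFFER_RANGES = {
    "mg_mM": (0.5, 10.0),       # Mg2+ concentration in mM
    "buffer_mM": (5.0, 150.0),  # buffering agent (e.g. Tris) in mM
    "pH": (6.0, 9.5),
}


def out_of_range(recipe):
    """Return the sorted names of recipe entries outside the stated ranges."""
    problems = []
    for name, (low, high) in BUFFER_RANGES.items():
        if not low <= recipe[name] <= high:
            problems.append(name)
    return sorted(problems)


# Hypothetical recipe using the values the text calls ideal or most preferred.
recipe = {"mg_mM": 5.0, "buffer_mM": 20.0, "pH": 7.3}
print(out_of_range(recipe))
```

A recipe that passes this check is only within the broad ranges; the narrower preferred ranges (e.g. about 3 to 6 mM Mg²⁺) could be checked the same way with a second table.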

In preparing the reverse transcriptase, DNA extension or amplification reaction mixture of the steps of the subject methods, the various constituent components may be combined in any convenient order. For example, in the amplification reaction the buffer may be combined with primer, polymerase and then template DNA, or all of the various constituent components may be combined at the same time to produce the reaction mixture.

As discussed above, in a preferred embodiment of the invention the DNA, e.g. cDNA molecules may be modified by the addition of amplification domains to the ends of the nucleic acid molecules, which may involve a ligation reaction. A ligation reaction is also required for the in situ synthesis of the capture probe on the array, when the capture probe is immobilized indirectly on the array surface.

As is known in the art, ligases catalyze the formation of a phosphodiester bond between juxtaposed 3′-hydroxyl and 5′-phosphate termini of two immediately adjacent nucleic acids. Any convenient ligase may be employed, where representative ligases of interest include, but are not limited to: temperature sensitive and thermostable ligases. Temperature sensitive ligases include, but are not limited to, bacteriophage T4 DNA ligase, bacteriophage T7 ligase, and E. coli ligase. Thermostable ligases include, but are not limited to, Taq ligase, Tth ligase, and Pfu ligase. Thermostable ligase may be obtained from thermophilic or hyperthermophilic organisms, including but not limited to, prokaryotic, eukaryotic, or archaeal organisms. Certain RNA ligases may also be employed in the methods of the invention.

In this ligation step, a suitable ligase and any reagents that are necessary and/or desirable are combined with the reaction mixture and maintained under conditions sufficient for ligation of the relevant oligonucleotides to occur. Ligation reaction conditions are well known to those of skill in the art. During ligation, the reaction mixture in certain embodiments may be maintained at a temperature ranging from about 4° C. to about 50° C., such as from about 20° C. to about 37° C. for a period of time ranging from about 5 seconds to about 16 hours, such as from about 1 minute to about 1 hour. In yet other embodiments, the reaction mixture may be maintained at a temperature ranging from about 35° C. to about 45° C., such as from about 37° C. to about 42° C., e.g., at or about 38° C., 39° C., 40° C. or 41° C., for a period of time ranging from about 5 seconds to about 16 hours, such as from about 1 minute to about 1 hour, including from about 2 minutes to about 8 hours. In a representative embodiment, the ligation reaction mixture includes 50 mM Tris pH 7.5, 10 mM MgCl₂, 10 mM DTT, 1 mM ATP, 25 mg/ml BSA, 0.25 units/ml RNase inhibitor, and T4 DNA ligase at 0.125 units/ml. In yet another representative embodiment, 2.125 mM magnesium ion, 0.2 units/ml RNase inhibitor, and 0.125 units/ml DNA ligase are employed. The amount of adaptor in the reaction will be dependent on the concentration of the DNA, e.g. cDNA in the sample and will generally be present at between 10-100 times the molar amount of DNA, e.g. cDNA.
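The 10-100 fold molar excess of adaptor can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the DNA is assumed to be double stranded, and the mass-to-mole conversion uses the common approximation of about 660 g/mol per base pair, which is not given in the text.

```python
def dsdna_ng_to_pmol(mass_ng, length_bp, g_per_mol_per_bp=660.0):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules.

    Assumes an average of ~660 g/mol per base pair (an approximation
    introduced for this example, not a value from the text).
    """
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length positive")
    # ng -> g is 1e-9, mol -> pmol is 1e12, hence the net factor of 1000.
    return mass_ng * 1000.0 / (length_bp * g_per_mol_per_bp)


def adaptor_range_pmol(dna_pmol, min_fold=10.0, max_fold=100.0):
    """Return the (low, high) adaptor amounts for a 10-100 fold molar excess."""
    return dna_pmol * min_fold, dna_pmol * max_fold


# Hypothetical sample: 66 ng of cDNA with a mean length of 1000 bp.
dna_pmol = dsdna_ng_to_pmol(66.0, 1000)
low, high = adaptor_range_pmol(dna_pmol)
print(f"DNA: {dna_pmol:.3f} pmol, adaptor: {low:.2f} to {high:.2f} pmol")
```

For single-stranded first strand cDNA the per-nucleotide mass would be roughly half the per-base-pair value, so the `g_per_mol_per_bp` argument would need to be adjusted accordingly.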

By way of a representative example the method of the invention may comprise the following steps:

(a) contacting an array with a tissue sample, wherein the array comprises a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a reverse transcriptase (RT) primer, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

such that RNA of the tissue sample hybridises to said capture probes;

(b) imaging the tissue sample on the array;

(c) reverse transcribing the captured mRNA molecules to generate cDNA molecules;

(d) washing the array to remove residual tissue;

(e) releasing at least part of the cDNA molecules from the surface of the array;

(f) performing second strand cDNA synthesis on the released cDNA molecules;

and

(g) analysing the sequence of (e.g. sequencing) the cDNA molecules.

By way of an alternative representative example the method of the invention may comprise the following steps:

(a) contacting an array with a tissue sample, wherein the array comprises a substrate on which at least two species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a reverse transcriptase (RT) primer, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

such that RNA of the tissue sample hybridises to said capture probes;

(b) optionally rehydrating the tissue sample;

(c) reverse transcribing the captured mRNA molecules to generate first strand cDNA molecules and optionally synthesising second strand cDNA molecules;

(d) imaging the tissue sample on the array;

(e) washing the array to remove residual tissue;

(f) releasing at least part of the cDNA molecules from the surface of the array;

(g) amplifying the released cDNA molecules;

and

(h) analysing the sequence of (e.g. sequencing) the amplified cDNA molecules.

By way of yet a further representative example the method of the invention may comprise the following steps:

(a) contacting an array with a tissue sample, wherein the array comprises a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a reverse transcriptase (RT) primer, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

such that RNA of the tissue sample hybridises to said capture probes;

(b) optionally imaging the tissue sample on the array;

(c) reverse transcribing the captured mRNA molecules to generate cDNA molecules;

(d) optionally imaging the tissue sample on the array if not already performed as step (b);

(e) washing the array to remove residual tissue;

(f) releasing at least part of the cDNA molecules from the surface of the array;

(g) performing second strand cDNA synthesis on the released cDNA molecules;

(h) amplifying the double stranded cDNA molecules;

(i) optionally purifying the cDNA molecules to remove components that may interfere with the sequencing reaction;

and

(j) analysing the sequence of (e.g. sequencing) the amplified cDNA molecules.

The present invention includes any suitable combination of the steps in the above described methods. It will be understood that the invention also encompasses variations of these methods, for example where amplification is performed in situ on the array. Also encompassed are methods which omit the imaging step.

The invention may also be seen to include a method for making or producing an array (i) for use in capturing mRNA from a tissue sample that is contacted with said array; or (ii) for use in determining and/or analysing a (e.g. the partial or global) transcriptome of a tissue sample, said method comprising immobilizing, directly or indirectly, multiple species of capture probe to an array substrate, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array; and

(ii) a capture domain.

The method of producing an array of the invention may be further defined such that each species of capture probe is immobilized as a feature on the array.

Immobilization of the capture probes on the array may be achieved using any suitable means as described herein. Where the capture probes are immobilized on the array indirectly the capture probe may be synthesized on the array. Said method may comprise any one or more of the following steps:

(a) immobilizing directly or indirectly multiple surface probes to an array substrate, wherein the surface probes comprise:

(i) a domain capable of hybridizing to part of the capture domain oligonucleotide (a part not involved in capturing the nucleic acid, e.g. RNA);

(ii) a complementary positional domain; and

(iii) a complementary universal domain;

(b) hybridizing capture domain oligonucleotides and universal domain oligonucleotides to the surface probes immobilized on the array;

(c) extending the universal domain oligonucleotides, by templated polymerisation, to generate the positional domain of the capture probe; and

(d) ligating the positional domain to the capture domain oligonucleotide to produce the capture oligonucleotide.

Ligation in step (d) may occur simultaneously with extension in step (c). Thus it need not be carried out in a separate step, although this is of course encompassed if desired.
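Steps (a) to (d) above can be pictured as simple string operations on sequences. The following Python sketch is illustrative only: all sequences and function names are hypothetical, hybridisation is modelled as exact complementarity, and extension and ligation are modelled as string slicing and concatenation rather than as enzymatic reactions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_surface_probe(universal, positional, capture_anchor):
    """Step (a): the surface probe carries the complements of the universal
    domain, the positional domain and the hybridising part of the capture
    domain oligonucleotide."""
    return revcomp(universal + positional + capture_anchor)


def synthesize_capture_probe(surface_probe, universal_oligo, capture_oligo, anchor_len):
    """Steps (b)-(d): hybridise, extend across the positional domain, ligate."""
    if anchor_len <= 0 or anchor_len > len(capture_oligo):
        raise ValueError("anchor_len must be between 1 and len(capture_oligo)")
    # Template written in the sense of the capture probe, 5' to 3'.
    template = revcomp(surface_probe)
    # Step (b): both oligonucleotides must find their complementary domains.
    if not template.startswith(universal_oligo):
        raise ValueError("universal oligonucleotide does not hybridise")
    if template[-anchor_len:] != capture_oligo[:anchor_len]:
        raise ValueError("capture domain oligonucleotide does not hybridise")
    # Step (c): templated extension of the universal oligonucleotide copies
    # the positional domain and stops at the hybridised capture oligonucleotide.
    extended = template[: len(template) - anchor_len]
    # Step (d): ligation of the extended strand to the capture oligonucleotide.
    return extended + capture_oligo


# Hypothetical sequences chosen only for illustration.
universal = "AGCTTGCA"
positional = "CATGGTAC"
capture_oligo = "GTCA" + "T" * 20  # 4-nt hybridising part followed by poly-T
surface = make_surface_probe(universal, positional, capture_oligo[:4])
probe = synthesize_capture_probe(surface, universal, capture_oligo, anchor_len=4)
print(probe)
```

In this model the finished probe reads universal domain, positional domain, capture domain from 5′ to 3′, so the capture domain (here a poly-T stretch) ends up at the free 3′ end.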

The features of the array produced by the above method of producing the array of the invention may be further defined in accordance with the above description.

Although the invention is described above with reference to detection or analysis of RNA, and transcriptome analysis or detection, it will be appreciated that the principles described can be applied analogously to the detection or analysis of DNA in cells and to genomic studies. Thus, more broadly viewed, the invention can be seen as being generally applicable to the detection of nucleic acids in general and in a further more particular aspect, as providing methods for the analysis or detection of DNA. Spatial information may be valuable also in a genomics context i.e. detection and/or analysis of a DNA molecule with spatial resolution. This may be achieved by genomic tagging according to the present invention. Such localised or spatial detection methods may be useful for example in the context of studying genomic variations in different cells or regions of a tissue, for example comparing normal and diseased cells or tissues (e.g. normal vs tumour cells or tissues) or in studying genomic changes in disease progression etc. For example, tumour tissues may comprise a heterogeneous population of cells which may differ in the genomic variants they contain (e.g. mutations and/or other genetic aberrations, for example chromosomal rearrangements, chromosomal amplifications/deletions/insertions etc.). The detection of genomic variations, or different genomic loci, in different cells in a localised way may be useful in such a context, e.g. to study the spatial distribution of genomic variations. A principal utility of such a method would be in tumour analysis. In the context of the present invention, an array may be prepared which is designed, for example, to capture the genome of an entire cell on one feature. Different cells in the tissue sample may thus be compared. Of course the invention is not limited to such a design and other variations may be possible, wherein the DNA is detected in a localised way and the position of the DNA captured on the array is correlated to a position or location in the tissue sample.

Accordingly, in a more general aspect, the present invention can be seen to provide a method for localised detection of nucleic acid in a tissue sample comprising:

(a) providing an array comprising a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a primer for a primer extension or ligation reaction, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

(b) contacting said array with a tissue sample such that the position of a capture probe on the array may be correlated with a position in the tissue sample and allowing nucleic acid of the tissue sample to hybridise to the capture domain in said capture probes;

(c) generating DNA molecules from the captured nucleic acid molecules using said capture probes as extension or ligation primers, wherein said extended or ligated DNA molecules are tagged by virtue of the positional domain;

(d) optionally generating a complementary strand of said tagged DNA and/or optionally amplifying said tagged DNA;

(e) releasing at least part of the tagged DNA molecules and/or their complements or amplicons from the surface of the array, wherein said part includes the positional domain or a complement thereof;

(f) directly or indirectly analysing the sequence of (e.g. sequencing) the released DNA molecules.

As described in more detail above, any method of nucleic acid analysis may be used in the analysis step. Typically this may involve sequencing, but it is not necessary to perform an actual sequence determination. For example sequence-specific methods of analysis may be used. For example a sequence-specific amplification reaction may be performed, for example using primers which are specific for the positional domain and/or for a specific target sequence, e.g. a particular target DNA to be detected (i.e. corresponding to a particular cDNA/RNA or gene or gene variant or genomic locus or genomic variant etc.). An exemplary analysis method is a sequence-specific PCR reaction.

The sequence analysis (e.g. sequencing) information obtained in step (f) may be used to obtain spatial information as to the nucleic acid in the sample. In other words the sequence analysis information may provide information as to the location of the nucleic acid in the sample. This spatial information may be derived from the nature of the sequence analysis information obtained e.g. from a sequence determined or identified, for example it may reveal the presence of a particular nucleic acid molecule which may itself be spatially informative in the context of the tissue sample used, and/or the spatial information (e.g. spatial localisation) may be derived from the position of the tissue sample on the array, coupled with the sequence analysis information. However, as described above, spatial information may conveniently be obtained by correlating the sequence analysis data to an image of the tissue sample and this represents one preferred embodiment of the invention.

Accordingly, in a preferred embodiment the method also includes a step of:

(g) correlating said sequence analysis information with an image of said tissue sample, wherein the tissue sample is imaged before or after step (c).
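The correlation in step (g) can be sketched as a lookup from the positional domain in each read to a feature on the array, followed by a mapping from feature coordinates to image pixels. The Python example below is a minimal sketch under several assumptions not stated in the text: the positional domain sits at a fixed offset in the read, it is matched exactly, and the array is aligned with the image axes so that a pitch and an origin suffice. All names and values are hypothetical.

```python
def read_to_pixel(read, barcode_to_feature, barcode_start, barcode_len,
                  pitch_px, origin_px):
    """Map one sequenced read to an image pixel via its positional domain.

    Returns None when the positional domain is not a known barcode.
    """
    tag = read[barcode_start:barcode_start + barcode_len]
    feature = barcode_to_feature.get(tag)
    if feature is None:
        return None
    col, row = feature
    # Axis-aligned mapping: image position = origin + index * feature pitch.
    return (origin_px[0] + col * pitch_px, origin_px[1] + row * pitch_px)


# Hypothetical two-feature array and reads.
barcode_to_feature = {"CATGGTAC": (0, 0), "TTGACCGA": (3, 1)}
reads = [
    "AGCTTGCA" + "TTGACCGA" + "ACGTACGTAC",  # known positional domain
    "AGCTTGCA" + "GGGGGGGG" + "ACGTACGTAC",  # unknown positional domain
]
for read in reads:
    print(read_to_pixel(read, barcode_to_feature, 8, 8,
                        pitch_px=50, origin_px=(100, 200)))
```

A real image registration would usually need rotation and scaling as well, and sequencing errors in the positional domain would call for tolerant rather than exact matching.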

The primer extension reaction referred to in step (a) may be defined as a polymerase-catalysed extension reaction and acts to acquire a complementary strand of the captured nucleic acid molecule that is covalently attached to the capture probe, i.e. by synthesising the complementary strand utilising the capture probe as a primer and the captured nucleic acid as a template. In other words it may be any primer extension reaction carried out by any polymerase enzyme. The nucleic acid may be RNA or it may be DNA. Accordingly the polymerase may be any polymerase. It may be a reverse transcriptase or it may be a DNA polymerase. The ligation reaction may be carried out by any ligase and acts to secure the complementary strand of the captured nucleic acid molecule to the capture probe, i.e. wherein the captured nucleic acid molecule (hybridised to the capture probe) is partially double stranded and the complementary strand is ligated to the capture probe.

One preferred embodiment of such a method is the method described above for the determination and/or analysis of a transcriptome, or for the detection of RNA. In an alternative preferred embodiment the detected nucleic acid molecule is DNA. In such an embodiment the invention provides a method for localised detection of DNA in a tissue sample comprising:

(a) providing an array comprising a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a primer for a primer extension or ligation reaction, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

(b) contacting said array with a tissue sample such that the position of a capture probe on the array may be correlated with a position in the tissue sample and allowing DNA of the tissue sample to hybridise to the capture domain in said capture probes;

(c) fragmenting DNA in said tissue sample, wherein said fragmentation is carried out before, during or after contacting the array with the tissue sample in step (b);

(d) extending said capture probes in a primer extension reaction using the captured DNA fragments as templates to generate extended DNA molecules, or ligating the captured DNA fragments to the capture probes in a ligation reaction to generate ligated DNA molecules, wherein said extended or ligated DNA molecules are tagged by virtue of the positional domain;

(e) optionally generating a complementary strand of said tagged DNA and/or optionally amplifying said tagged DNA;

(f) releasing at least part of the tagged DNA molecules and/or their complements and/or amplicons from the surface of the array, wherein said part includes the positional domain or a complement thereof;

(g) directly or indirectly analysing the sequence of the released DNA molecules.

The method may further include a step of:

(h) correlating said sequence analysis information with an image of said tissue sample, wherein the tissue sample is imaged before or after step (d).

In the context of spatial genomics, where the target nucleic acid is DNA, the inclusion of imaging and image correlation steps may in some circumstances be preferred.

In embodiments in which DNA is captured, the DNA may be any DNA molecule which may occur in a cell. Thus it may be genomic, i.e. nuclear, DNA, mitochondrial DNA or plastid DNA, e.g. chloroplast DNA. In a preferred embodiment, the DNA is genomic DNA.

It will be understood that where fragmentation is carried out after the contacting in step (b), i.e. after the tissue sample is placed on the array, fragmentation occurs before the DNA is hybridised to the capture domain. In other words the DNA fragments are hybridised (or more particularly, allowed to hybridise) to the capture domain in said capture probes.

Advantageously, but not necessarily, in a particular embodiment of this aspect of the invention, the DNA fragments of the tissue sample may be provided with a binding domain to enable or facilitate their capture by the capture probes on the array. Accordingly, the binding domain is capable of hybridising to the capture domain of the capture probe. Such a binding domain may thus be regarded as a complement of the capture domain (i.e. it may be viewed as a complementary capture domain), although absolute complementarity between the capture and binding domains is not required, merely that the binding domain is sufficiently complementary to allow a productive hybridisation to take place, i.e. that the DNA fragments in the tissue sample are able to hybridise to the capture domain of the capture probes. Provision of such a binding domain may ensure that DNA in the sample does not bind to the capture probes until after the fragmentation step. The binding domain may be provided to the DNA fragments by procedures well known in the art, for example by ligation of adaptor or linker sequences which may contain the binding domain. For example a linker sequence with a protruding end may be used. The binding domain may be present in the single-stranded portion of such a linker, such that following ligation of the linker to the DNA fragments, the single-stranded portion containing the binding domain is available for hybridisation to the capture domain of the capture probes. Alternatively and in a preferred embodiment, the binding domain may be introduced by using a terminal transferase enzyme to introduce a polynucleotide tail e.g. a homopolymeric tail such as a poly-A domain. This may be carried out using a procedure analogous to that described above for introducing a universal domain in the context of the RNA methods. Thus, in advantageous embodiments a common binding domain may be introduced, in other words a binding domain which is common to all the DNA fragments and which may be used to achieve the capture of the fragments on the array.

Where a tailing reaction is carried out to introduce a (common) binding domain, the capture probes on the array may be protected from the tailing reaction, i.e. the capture probes may be blocked or masked as described above. This may be achieved for example by hybridising a blocking oligonucleotide to the capture probe e.g. to the protruding end (e.g. single stranded portion) of the capture probe.

Where the capture domain comprises a poly-T sequence for example, such a blocking oligonucleotide may be a poly-A oligonucleotide. The blocking oligonucleotide may have a blocked 3′ end (i.e. an end incapable of being extended, or tailed). The capture probes may also be protected, i.e. blocked, by chemical and/or enzymatic modifications, as described in detail above.
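The relationship between a capture domain and its blocking oligonucleotide is that of reverse complements, which the following short Python sketch illustrates. The function names and sequences are hypothetical, and the 3′ block itself is a chemical modification that is not modelled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def blocking_oligo(capture_domain):
    """Return the sequence (5' to 3') that hybridises to the capture domain.

    This is simply the reverse complement; a real blocking oligonucleotide
    would additionally carry a blocked 3' end, which is not modelled.
    """
    return capture_domain.translate(_COMPLEMENT)[::-1]


# A poly-T capture domain is masked by a poly-A oligonucleotide.
print(blocking_oligo("T" * 20))
# A hypothetical mixed capture domain.
print(blocking_oligo("TTTTGC"))
```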

Where the binding domain is provided by ligation of a linker as described above, it will be understood that rather than extending the capture probe to generate a complementary copy of the captured DNA fragment which comprises the positional tag of the capture probe primer, the DNA fragment may be ligated to the 3′ end of the capture probe. As noted above ligation requires that the 5′ end to be ligated is phosphorylated. Accordingly, in one embodiment, the 5′ end of the added linker, namely the end which is to be ligated to the capture probe (i.e. the non-protruding end of the linker added to the DNA fragments) will be phosphorylated. In such a ligation embodiment, it will accordingly be seen that a linker may be ligated to double stranded DNA fragments, said linker having a single stranded protruding 3′ end which contains the binding domain. Upon contact with the array, the protruding end hybridises to the capture domain of the capture probes. This hybridisation brings the 3′ end of the capture probe into juxtaposition for ligation to the 5′ (non-protruding) end of the added linker. The capture probe, and hence the positional domain, is thus incorporated into the captured DNA fragment by this ligation. Such an embodiment is shown schematically in FIG. 21.
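The ligation embodiment just described can be modelled in a few lines to show how the positional domain ends up in the captured fragment. This Python sketch is an illustration under simplifying assumptions: strands are plain strings written 5′ to 3′, hybridisation is exact complementarity, and the class, function and sequence names are all hypothetical.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Strand:
    seq: str                            # written 5' to 3'
    five_prime_phosphate: bool = False  # needed on the downstream strand


def ligate(upstream, downstream):
    """Join the 3'-hydroxyl of `upstream` to the 5'-phosphate of `downstream`."""
    if not downstream.five_prime_phosphate:
        raise ValueError("the 5' end to be ligated must be phosphorylated")
    return Strand(upstream.seq + downstream.seq, upstream.five_prime_phosphate)


# Hypothetical capture probe: positional domain followed by capture domain.
positional, capture_domain = "CATGGTAC", "T" * 12
capture_probe = Strand(positional + capture_domain)

# Protruding 3' end of the linker: the binding domain (here poly-A).
binding_domain = revcomp(capture_domain)
# Recessed linker strand with a 5' phosphate, already joined to one strand
# of the DNA fragment (both sequences are made up for the example).
linked_fragment = Strand("GGCCAATT" + "ACGTTGCAAGCT", five_prime_phosphate=True)

# Hybridisation of the protruding end to the capture domain positions the
# probe's 3' end next to the phosphorylated 5' end, allowing ligation.
assert revcomp(binding_domain) == capture_domain
tagged = ligate(capture_probe, linked_fragment)
print(tagged.seq)
```

The ligated product begins with the positional domain, so any sequence read from it can be traced back to the feature on which the fragment was captured.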

Thus, the method of this aspect of the invention may in a more particular embodiment comprise:

(a) providing an array comprising a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a primer for a primer extension or ligation reaction, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain;

(b) contacting said array with a tissue sample such that the position of a capture probe on the array may be correlated with a position in the tissue sample;

(c) fragmenting DNA in said tissue sample, wherein said fragmentation is carried out before, during or after contacting the array with the tissue sample in step (b);

(d) providing said DNA fragments with a binding domain which is capable of hybridising to said capture domain;

(e) allowing said DNA fragments to hybridise to the capture domain in said capture probes;

(f) extending said capture probes in a primer extension reaction using the captured DNA fragments as templates to generate extended DNA molecules, or ligating the captured DNA fragments to the capture probes in a ligation reaction to generate ligated DNA molecules, wherein said extended or ligated DNA molecules are tagged by virtue of the positional domain;

(g) optionally generating a complementary strand of said tagged DNA and/or optionally amplifying the tagged DNA;

(h) releasing at least part of the tagged DNA molecules and/or their complements and/or amplicons from the surface of the array, wherein said part includes the positional domain or a complement thereof;

(i) directly or indirectly analysing the sequence of the released DNA molecules.

The method may optionally include a further step of:

(j) correlating said sequence analysis information with an image of said tissue sample, wherein the tissue sample is imaged before or after step (f).

In the methods of nucleic acid or DNA detection set out above, the optional step of generating a complementary copy of the tagged nucleic acid/DNA or of amplifying the tagged DNA, may involve the use of a strand displacing polymerase enzyme, according to the principles explained above in the context of the RNA/transcriptome analysis/detection methods. Suitable strand displacing polymerases are discussed above. This is to ensure that the positional domain is copied into the complementary copy or amplicon. This will particularly be the case where the capture probe is immobilized on the array by hybridisation to a surface probe.

However, the use of a strand displacing polymerase in this step is not essential. For example a non-strand displacing polymerase may be used together with ligation of an oligonucleotide which hybridises to the positional domain. Such a procedure is analogous to that described above for the synthesis of capture probes on the array.

In one embodiment, the method of the invention may be used for determining and/or analysing all of the genome of a tissue sample e.g. the global genome of a tissue sample. However, the method is not limited to this and encompasses determining and/or analysing all or part of the genome. Thus, the method may involve determining and/or analysing a part or subset of the genome, e.g. a partial genome corresponding to a subset or group of genes or of chromosomes, e.g. a set of particular genes or chromosomes or a particular region or part of the genome, for example related to a particular disease or condition, tissue type etc. Thus, the method may be used to detect or analyse genomic sequences or genomic loci from tumour tissue as compared to normal tissue, or even within different types of cell in a tissue sample. The presence or absence, or the distribution or location of different genomic variants or loci in different cells, groups of cells, tissues or parts or types of tissue may be examined.

Viewed from another aspect, the method steps set out above can be seen as providing a method of obtaining spatial information regarding the nucleic acids, e.g. genomic sequences, variants or loci of a tissue sample. Put another way, the methods of the invention may be used for the labelling (or tagging) of genomes, particularly individual or spatially distributed genomes.

Alternatively viewed, the method of the invention may be seen as amethod for spatial detection of DNA in a tissue sample, or a method fordetecting DNA with spatial resolution, or for localised or spatialdetermination and/or analysis of DNA in a tissue sample. In particular,the method may be used for the localised or spatial detection ordetermination and/or analysis of genes or genomic sequences or genomicvariants or loci (e.g. distribution of genomic variants or loci) in atissue sample. The localised/spatial detection/determination/analysismeans that the DNA may be localised to its native position or locationwithin a cell or tissue in the tissue sample. Thus for example, the DNAmay be localised to a cell or group of cells, or type of cells in thesample, or to particular regions of areas within a tissue sample. Thenative location or position of the DNA (or in other words, the locationor position of the DNA in the tissue sample), e.g. a genomic variant orlocus, may be determined.

It will be seen therefore that the array of the present invention may beused to capture nucleic acid, e.g. DNA of a tissue sample that iscontacted with said array. The array may also be used for determiningand/or analysing a partial or global genome of a tissue sample or forobtaining a spatially defined partial or global genome of a tissuesample. The methods of the invention may thus be considered as methodsof quantifying the spatial distribution of one or more genomic sequences(or variants or loci) in a tissue sample. Expressed another way, themethods of the present invention may be used to detect the spatialdistribution of one or more genomic sequences or genomic variants orgenomic loci in a tissue sample. In yet another way, the methods of thepresent invention may be used to determine simultaneously the locationor distribution of one or more genomic sequences or genomic variants orgenomic loci at one or more positions within a tissue sample. Stillfurther, the methods may be seen as methods for partial or globalanalysis of the nucleic acid e.g. DNA of a tissue sample with spatialresolution e.g. two-dimensional spatial resolution.

The invention can also be seen to provide an array for use in the methods of the invention comprising a substrate on which multiple species of capture probes are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as an extension or ligation primer, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array, and

(ii) a capture domain to capture nucleic acid of a tissue sample that is contacted with said array.

In one aspect the nucleic acid molecule to be captured is DNA. The capture domain may be specific to a particular DNA to be detected, or to a particular class or group of DNAs, e.g. by virtue of specific hybridisation to a specific sequence or motif in the target DNA e.g. a conserved sequence, by analogy to the methods described in the context of RNA detection above. Alternatively the DNA to be captured may be provided with a binding domain, e.g. a common binding domain as described above, which binding domain may be recognised by the capture domain of the capture probes. Thus, as noted above, the binding domain may for example be a homopolymeric sequence e.g. poly-A. Again such a binding domain may be provided according to or analogously to the principles and methods described above in relation to the methods for RNA/transcriptome analysis or detection. In such a case, the capture domain may be complementary to the binding domain introduced into the DNA molecules of the tissue sample.

As also described in the RNA context above, the capture domain may be a random or degenerate sequence. Thus, DNA may be captured non-specifically by binding to a random or degenerate capture domain or to a capture domain which comprises at least partially a random or degenerate sequence.
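By way of illustration only, the following Python sketch shows what a degenerate capture domain encodes. It assumes the standard IUPAC nucleotide ambiguity codes; the function name `expand_degenerate` and the two example domains (a random hexamer and a short poly-T stretch anchored by VN) are hypothetical and chosen for brevity, not taken from any particular embodiment.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq):
    """Return every concrete sequence encoded by a degenerate sequence."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


# A fully random hexamer domain and a short VN-anchored poly-T domain.
hexamers = expand_degenerate("NNNNNN")
anchored = expand_degenerate("TTTTVN")
print(len(hexamers), len(anchored))
```

A fully random domain therefore represents every sequence of its length, whereas a partially degenerate domain such as the VN-anchored example restricts binding to a small family of related sequences.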

In a related aspect, the present invention also provides use of an array, comprising a substrate on which multiple species of capture probe are directly or indirectly immobilized such that each species occupies a distinct position on the array and is oriented to have a free 3′ end to enable said probe to function as a primer for a primer extension or ligation reaction, wherein each species of said capture probe comprises a nucleic acid molecule with 5′ to 3′:

(i) a positional domain that corresponds to the position of the capture probe on the array; and

(ii) a capture domain;

to capture nucleic acid, e.g. DNA or RNA, of a tissue sample that is contacted with said array.

Preferably, said use is for localised detection of nucleic acid in a tissue sample and further comprises steps of:

(a) generating DNA molecules from the captured nucleic acid molecules using said capture probes as extension or ligation primers, wherein said extended or ligated molecules are tagged by virtue of the positional domain;

(b) optionally generating a complementary strand of said tagged nucleic acid and/or amplifying said tagged nucleic acid;

(c) releasing at least part of the tagged DNA molecules and/or their complements or amplicons from the surface of the array, wherein said part includes the positional domain or a complement thereof;

(d) directly or indirectly analysing the sequence of the released DNA molecules; and optionally

(e) correlating said sequence analysis information with an image of said tissue sample, wherein the tissue sample is imaged before or after step (a).

The step of fragmenting DNA in a tissue sample may be carried out using any desired procedure known in the art. Thus physical methods of fragmentation may be used e.g. sonication or ultrasound treatment. Chemical methods are also known. Enzymatic methods of fragmentation may also be used, e.g. with endonucleases, for example restriction enzymes. Again methods and enzymes for this are well known in the art. Fragmentation may be done before, during or after preparing the tissue sample for placing on an array, e.g. preparing a tissue section. Conveniently, fragmentation may be achieved in the step of fixing tissue. Thus for example, formalin fixation will result in fragmentation of DNA. Other fixatives may produce similar results.

In terms of the detail of preparing and using the arrays in these aspects of the invention, it will be understood that the description and detail given above in the context of RNA methods applies analogously to the more general nucleic acid detection and DNA detection methods set out herein. Thus, all aspects and details discussed above apply analogously. For example, the discussion of reverse transcriptase primers and reactions etc. may be applied analogously to any aspect of the extension primers, polymerase reactions etc. referred to above. Likewise, references to first and second strand cDNA synthesis may be applied analogously to the tagged DNA molecule and its complement. Methods of sequence analysis as discussed above may be used.

By way of example, the capture domain may be as described for the capture probes above. A poly-T or poly-T-containing capture domain may be used for example where the DNA fragments are provided with a binding domain comprising a poly-A sequence.

The capture probes/tagged DNA molecules (i.e. the tagged extended or ligated molecules) may be provided with universal domains as described above, e.g. for amplification and/or cleavage.

The invention will be further described with reference to the following non-limiting Examples with reference to the following drawings in which:

FIG. 1 shows the overall concept using arrayed “barcoded” oligo-dT probes to capture mRNA from tissue sections for transcriptome analysis.

FIG. 2 shows a schematic for the visualization of transcript abundance for corresponding tissue sections.

FIG. 3 shows 3′ to 5′ surface probe composition and synthesis of 5′ to 3′ oriented capture probes that are indirectly immobilized at the array surface.

FIG. 4 shows a bar chart demonstrating the efficiency of enzymatic cleavage (USER or RsaI) from in-house manufactured arrays and by 99° C. water from Agilent manufactured arrays, as measured by hybridization of fluorescently labelled probes to the array surface after probe release.

FIG. 5 shows a fluorescent image captured after 99° C. water mediated release of DNA surface probes from commercial arrays manufactured by Agilent. A fluorescent detection probe was hybridized after hot water treatment. Top array is an untreated control.

FIG. 6 shows a fixed mouse brain tissue section on top of the transcriptome capture array post cDNA synthesis and treated with cytoplasmic (top) and nucleic stains (middle), respectively, and a merged image showing both stains (bottom).

FIG. 7 shows a table that lists the reads sorted for their origin across the low density in-house manufactured DNA-capture array as seen in the schematic representation.

FIG. 8 shows a FFPE mouse brain tissue with nucleic and Map2 specific stains using a barcoded microarray.

FIG. 9 shows FFPE mouse brain olfactory bulb with nucleic stain (white) and visible morphology.

FIG. 10 shows FFPE mouse brain olfactory bulb (approx 2×2 mm) with nucleic stain (white), overlaid with theoretical spotting pattern for low resolution array.

FIG. 11 shows FFPE mouse brain olfactory bulb (approx 2×2 mm) with nucleic stain (white), overlaid with theoretical spotting pattern for medium-high resolution array.

FIG. 12 shows FFPE mouse brain olfactory bulb zoomed in on glomerular area (top right of FIG. 9).

FIG. 13 shows the resulting product from a USER release using a random hexamer primer (R6) coupled to the B_handle (B_R6) during amplification; product as depicted on a bioanalyzer.

FIG. 14 shows the resulting product from a USER release using a random octamer primer (R8) coupled to the B_handle (B_R8) during amplification; product as depicted on a bioanalyzer.

FIG. 15 shows the results of an experiment performed on FFPE brain tissue covering the whole array. ID5 (left) and ID20 (right) amplified with ID specific and gene specific primers (B2M exon 4) after synthesis and release of cDNA from surface; ID5 and ID20 amplified.

FIG. 16 shows a schematic illustration of the principle of the method described in Example 4, i.e. use of microarrays with immobilized DNA oligos (capture probes) carrying spatial labeling tag sequences (positional domains). Each feature of oligos of the microarray carries 1) a unique labeling tag (positional domain) and 2) a capture sequence (capture domain).

FIG. 17 shows the results of the spatial genomics protocol described in Example 5 carried out with genomic DNA prefragmented to mean size of 200 bp. Internal products amplified on array labeled and synthesized DNA. The detected peak is of expected size.

FIG. 18 shows the results of the spatial genomics protocol described in Example 5 carried out with genomic DNA prefragmented to mean size of 700 bp. Internal products amplified on array labeled and synthesized DNA. The detected peak is of expected size.

FIG. 19 shows the results of the spatial genomics protocol described in Example 5 carried out with genomic DNA prefragmented to mean size of 200 bp. Products amplified with one internal primer and one universal sequence contained in the surface oligo. Amplification carried out on array labeled and synthesized DNA. The expected product is a smear given that the random fragmentation and terminal transferase labeling of genomic DNA will generate a very diverse sample pool.

FIG. 20 shows the results of the spatial genomics protocol described in Example 5 carried out with genomic DNA prefragmented to mean size of 700 bp. Products amplified with one internal primer and one universal sequence contained in the surface oligo. Amplification carried out on array labeled and synthesized DNA. The expected product is a smear given that the random fragmentation and terminal transferase labeling of genomic DNA will generate a very diverse sample pool.

FIG. 21 shows a schematic illustration of the ligation of a linker to a DNA fragment to introduce a binding domain for hybridisation to a poly-T capture domain, and subsequent ligation to the capture probe.

FIG. 22 shows the composition of 5′ to 3′ oriented capture probes used on high-density capture arrays.

FIG. 23 shows the frame of the high-density arrays, which is used to orientate the tissue sample, visualized by hybridization of fluorescent marker probes.

FIG. 24 shows capture probes cleaved and non-cleaved from high-density array, wherein the frame probes are not cleaved since they do not contain uracil bases. Capture probes were labelled with fluorophores coupled to poly-A oligonucleotides.

FIG. 25 shows a bioanalyzer image of a prepared sequencing library with transcripts captured from mouse olfactory bulb.

FIG. 26 shows a Matlab visualization of captured transcripts from total RNA extracted from mouse olfactory bulb.

FIG. 27 shows Olfr (olfactory receptor) transcripts as visualized across the capture array using Matlab visualization after capture from mouse olfactory bulb tissue.

FIG. 28 shows a pattern of printing for in-house 41-ID-tag microarrays.

FIG. 29 shows a spatial genomics library generated from an A431 specific translocation after capture of poly-A tailed genomic fragments on capture array.

FIG. 30 shows the detection of A431 specific translocation after capture of spiked 10% and 50% poly-A tailed A431 genomic fragments into poly-A tailed U2OS genomic fragments on capture array.

FIG. 31 shows a Matlab visualization of captured ID-tagged transcripts from mouse olfactory bulb tissue on 41-ID-tag in-house arrays overlaid with the tissue image. For clarity, the specific features on which particular genes were identified have been circled.

EXAMPLE 1

Preparation of the Array

The following experiments demonstrate how oligonucleotide probes may be attached to an array substrate by either the 5′ or 3′ end to yield an array with capture probes capable of hybridizing to mRNA.

Preparation of In-House Printed Microarray with 5′ to 3′ Oriented Probes

20 RNA-capture oligonucleotides with individual tag sequences (Tag 1-20, Table 1) were spotted on glass slides to function as capture probes. The probes were synthesized with a 5′-terminus amino linker with a C6 spacer. All probes were synthesized by Sigma-Aldrich (St. Louis, Mo., USA). The RNA-capture probes were suspended at a concentration of 20 μM in 150 mM sodium phosphate, pH 8.5 and were spotted using a Nanoplotter NP2.1/E (Gesim, Grosserkmannsdorf, Germany) onto CodeLink™ Activated microarray slides (7.5 cm×2.5 cm; Surmodics, Eden Prairie, Minn., USA). After printing, surface blocking was performed according to the manufacturer's instructions. The probes were printed in 16 identical arrays on the slide, and each array contained a pre-defined printing pattern. The 16 sub-arrays were separated during hybridization by a 16-pad mask (ChipClip™ Schleicher & Schuell BioScience, Keene, N.H., USA).

TABLE 1

Name | Sequence | 5′ mod | 3′ mod | Length

Sequences for free 3′ capture probes
TAP-ID1 | UUAAGTACAAATCTCGACTGCCACTCTGAACCTTCTCCTTCTCCTTCACCTTTTTTTTTTTTTTTTTTTTVN (SEQ ID NO: 1) | Amino-C6 | | 72
Enzymatic recog | UUAAGTACAA (SEQ ID NO: 2) | | | 10
Universal amp handle P | ATCTCGACTGCCACTCTGAA (SEQ ID NO: 3) | | | 20
ID1 | CCTTCTCCTTCTCCTTCACC (SEQ ID NO: 4) | | | 20
Capture sequence | TTTTTTTTTTTTTTTTTTTTVN (SEQ ID NO: 5) | | | 22
ID1 | CCTTCTCCTTCTCCTTCACC (SEQ ID NO: 6) | | | 20
ID2 | CCTTGCTGCTTCTCCTCCTC (SEQ ID NO: 7) | | | 20
ID3 | ACCTCCTCCGCCTCCTCCTC (SEQ ID NO: 8) | | | 20
ID4 | GAGACATACCACCAAGAGAC (SEQ ID NO: 9) | | | 20
ID5 | GTCCTCTATTCCGTCACCAT (SEQ ID NO: 10) | | | 20
ID6 | GACTGAGCTCGAACATATGG (SEQ ID NO: 11) | | | 20
ID7 | TGGAGGATTGACACAGAACG (SEQ ID NO: 12) | | | 20
ID8 | CCAGCCTCTCCATTACATCG (SEQ ID NO: 13) | | | 20
ID9 | AAGATCTACCAGCCAGCCAG (SEQ ID NO: 14) | | | 20
ID10 | CGAACTTCCACTGTCTCCTC (SEQ ID NO: 15) | | | 20
ID11 | TTGCGCCTTCTCCAATACAC (SEQ ID NO: 16) | | | 20
ID12 | CTCTTCTTAGCATGCCACCT (SEQ ID NO: 17) | | | 20
ID13 | ACCACTTCTGCATTACCTCC (SEQ ID NO: 18) | | | 20
ID14 | ACAGCCTCCTCTTCTTCCTT (SEQ ID NO: 19) | | | 20
ID15 | AATCCTCTCCTTGCCAGTTC (SEQ ID NO: 20) | | | 20
ID16 | GATGCCTCCACCTGTAGAAC (SEQ ID NO: 21) | | | 20
ID17 | GAAGGAATGGAGGATATCGC (SEQ ID NO: 22) | | | 20
ID18 | GATCCAAGGACCATCGACTG (SEQ ID NO: 23) | | | 20
ID19 | CCACTGGAACCTGACAACCG (SEQ ID NO: 24) | | | 20
ID20 | CTGCTTCTTCCTGGAACTCA (SEQ ID NO: 25) | | | 20

Sequences for free 5′ surface probes and on-chip free 3′ capture probe synthesis
Free 5′ surface probe-A | GCGTTCAGAGTGGCAGTCGAGATCACGCGGCAATCATATCGGACAGATCGGAAGAGCGTAGTGTAG (SEQ ID NO: 26) | | Amino C7 | 66
Free 5′ surface probe-U | GCGTTCAGAGTGGCAGTCGAGATCACGCGGCAATCATATCGGACGGCTGCTGGTAAATAGAGATCA (SEQ ID NO: 27) | | Amino C7 | 66
Nick | GCG | | | 3
LP′ | TTCAGAGTGGCAGTCGAGATCAC (SEQ ID NO: 28) | | | 23
ID′ | GCGGCAATCATATCGGAC (SEQ ID NO: 29) | | | 18
A′ 22 bp MutY mismatch | AGATCGGAAGAGCGTAGTGTAG (SEQ ID NO: 30) | | | 22
U′ 22 bp MutY mismatch | GGCTGCTGGTAAATAGAGATCA (SEQ ID NO: 31) | | |

Hybridized sequences for capture probe synthesis
Illumina amp handle A | ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 32) | | | 33
Universal amp handle U | AAGTGTGGAAAGTTGATCGCTATTTACCAGCAGCC (SEQ ID NO: 33) | | | 35
Capture_LP_Poly-dTVN | GTGATCTCGACTGCCACTCTGAATTTTTTTTTTTTTTTTTTTTVN (SEQ ID NO: 34) | Phosphorylated | | 45
Capture_LP_Poly-d24T | GTGATCTCGACTGCCACTCTGAATTTTTTTTTTTTTTTTTTTTTTTT (SEQ ID NO: 35) | Phosphorylated | | 47

Additional secondary universal amplification handles
Illumina amp handle B | AGACGTGTGCTCTTCCGATCT (SEQ ID NO: 36) | | | 21
Universal amp handle X | ACGTCTGTGAATAGCCGCAT (SEQ ID NO: 37) | | | 20
B_R6 handle (or X) | AGACGTGTGCTCTTCCGATCTNNNNNNNN (SEQ ID NO: 38) | | | 27 (26)
B_R8 handle (or X) | AGACGTGTGCTCTTCCGATCTNNNNNNNNNN (SEQ ID NO: 39) | | | 29 (28)
B_polyTVN (or X) | AGACGTGTGCTCTTCCGATCTTTTTTTTTTTTTTTTTTTTTVN (SEQ ID NO: 40) | | | 43 (42)
B_poly24T (or X) | AGACGTGTGCTCTTCCGATCTTTTTTTTTTTTTTTTTTTTTTTTT (SEQ ID NO: 41) | | | 45 (44)

Amplification handle to incorporate A handle into P handle products
A_P handle | ACACTCTTTCCCTACACGACGCTCTTCCGATCTATCTCGACTGCCACTCTGAA (SEQ ID NO: 42) | | | 53
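By way of illustration only, the following Python sketch shows how the component sequences of Table 1 combine, 5′ to 3′, into a complete free 3′ capture probe such as TAP-ID1. The sequences are copied from Table 1; the function names, the dictionary layout and the domain labels used as keys are hypothetical conveniences for this sketch.

```python
# Domain sequences copied from Table 1, written 5' to 3'.
ENZYMATIC_RECOG = "UUAAGTACAA"          # SEQ ID NO: 2, enzymatic recognition domain
AMP_HANDLE_P = "ATCTCGACTGCCACTCTGAA"   # SEQ ID NO: 3, universal amplification handle P
CAPTURE_SEQUENCE = "T" * 20 + "VN"      # SEQ ID NO: 5, poly-T capture domain ending in VN
ID_TAGS = {                             # positional domains (two of the twenty tags)
    "ID1": "CCTTCTCCTTCTCCTTCACC",
    "ID2": "CCTTGCTGCTTCTCCTCCTC",
}

# Full-length TAP-ID1 as listed in Table 1 (SEQ ID NO: 1).
TAP_ID1 = "UUAAGTACAAATCTCGACTGCCACTCTGAACCTTCTCCTTCTCCTTCACC" + "T" * 20 + "VN"

# Hypothetical labels paired with the domain lengths given in Table 1.
DOMAIN_LENGTHS = (("cleavage", 10), ("amplification", 20),
                  ("positional", 20), ("capture", 22))


def build_capture_probe(tag_name):
    """Concatenate the four domains, 5' to 3', for the given ID tag."""
    return ENZYMATIC_RECOG + AMP_HANDLE_P + ID_TAGS[tag_name] + CAPTURE_SEQUENCE


def split_capture_probe(probe):
    """Split a capture probe back into its domains by fixed length."""
    domains, start = {}, 0
    for name, length in DOMAIN_LENGTHS:
        domains[name] = probe[start:start + length]
        start += length
    return domains


probe = build_capture_probe("ID1")
print(len(probe), probe == TAP_ID1)
print(split_capture_probe(build_capture_probe("ID2"))["positional"])
```

Only the positional domain differs between the probes printed at different features, so the same assembly applies to each of ID1 to ID20.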

Preparation of In-House Printed Microarray with 3′ to 5′ Oriented Probes and Synthesis of 5′ to 3′ Oriented Capture Probes

Printing of surface probe oligonucleotides was performed as in the case with 5′ to 3′ oriented probes above, with an amino-C7 linker at the 3′ end, as shown in Table 1.

To hybridize primers for capture probe synthesis, hybridization solution containing 4×SSC and 0.1% SDS, 2 μM extension primer (the universal domain oligonucleotide) and 2 μM thread joining primer (the capture domain oligonucleotide) was incubated for 4 min at 50° C. Meanwhile the in-house array was attached to a ChipClip (Whatman). The array was subsequently incubated at 50° C. for 30 min at 300 rpm shake with 50 μL of hybridization solution per well.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake; 2) 0.2×SSC for 1 min at 300 rpm shake; and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry and placed back in the ChipClip.

For extension and ligation reaction (to generate the positional domain of the capture probe) 50 μL of enzyme mix containing 10× Ampligase buffer, 2.5 U AmpliTaq DNA Polymerase Stoffel Fragment (Applied Biosystems), 10 U Ampligase (Epicentre Biotechnologies), dNTPs 2 mM each (Fermentas) and water, was pipetted to each well. The array was subsequently incubated at 55° C. for 30 min. After incubation the array was washed according to the previously described array washing method, except that the first step had a duration of 10 min instead of 6 min.

The method is depicted in FIG. 3.
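By way of illustration only, the following Python sketch checks the sequence relationships that underlie the on-chip synthesis, using surface probe-A and the hybridized oligonucleotides of Table 1. It is an assumption of this sketch, inferred from the description above rather than stated in Table 1, that the finished capture probe consists of the Illumina amp handle A, the extended copy of ID′ and the ligated Capture_LP_Poly-dTVN oligonucleotide, in that 5′ to 3′ order. The helper name `revcomp` is hypothetical.

```python
# Sequences copied from Table 1 (5' to 3').
LP_PRIME = "TTCAGAGTGGCAGTCGAGATCAC"               # SEQ ID NO: 28
ID_PRIME = "GCGGCAATCATATCGGAC"                    # SEQ ID NO: 29
A_PRIME = "AGATCGGAAGAGCGTAGTGTAG"                 # SEQ ID NO: 30
HANDLE_A = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"     # SEQ ID NO: 32
CAPTURE_LP = "GTGATCTCGACTGCCACTCTGAA" + "T" * 20 + "VN"  # SEQ ID NO: 34

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# The capture-domain oligonucleotide hybridizes to LP' on the surface probe.
lp_pairs = CAPTURE_LP.startswith(revcomp(LP_PRIME))

# The 3' end of the handle A oligonucleotide pairs with A', apart from
# the position(s) counted here.
mismatches = sum(a != b for a, b in zip(HANDLE_A[-len(A_PRIME):], revcomp(A_PRIME)))

# Extension across ID' yields the positional domain of the capture probe.
positional_domain = revcomp(ID_PRIME)

# Assumed layout of the synthesized capture probe (see lead-in).
predicted_probe = HANDLE_A + positional_domain + CAPTURE_LP
print(lp_pairs, mismatches, positional_domain, len(predicted_probe))
```

The single non-complementary position found between handle A and A′ appears consistent with the "MutY mismatch" label given to A′ in Table 1.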

Tissue Preparation

The following experiments demonstrate how tissue sample sections may beprepared for use in the methods of the invention.

Preparation of Fresh Frozen Tissue and Sectioning onto Capture Probe Arrays

Fresh non-fixed mouse brain tissue was trimmed if necessary and frozen down in −40° C. cold isopentane and subsequently mounted for sectioning with a cryostat at 10 μm. A slice of tissue was applied onto each capture probe array to be used.

Preparation of Formalin-Fixed Paraffin-Embedded (FFPE) Tissue

Mouse brain tissue was fixed in 4% formalin at 4° C. for 24 h. After that it was incubated as follows: 3× incubation in 70% ethanol for 1 hour; 1× incubation in 80% ethanol for 1 hour; 1× incubation in 96% ethanol for 1 hour; 3× incubation in 100% ethanol for 1 hour; and 2× incubation in xylene at room temperature for 1 h.

The dehydrated samples were then incubated in liquid low melting paraffin at 52-54° C. for up to 3 hours, during which the paraffin was changed once to wash out residual xylene. Finished tissue blocks were then stored at RT. Sections were then cut at 4 μm in paraffin with a microtome onto each capture probe array to be used.

The sections were dried at 37° C. on the array slides for 24 hours and stored at RT.

Deparaffinization of FFPE Tissue

Formalin fixed paraffinized mouse brain 10 μm sections attached to CodeLink slides were deparaffinised as follows: xylene twice for 10 min; 99.5% ethanol for 2 min; 96% ethanol for 2 min; 70% ethanol for 2 min; and were then air dried.

cDNA Synthesis

The following experiments demonstrate that mRNA captured on the array from the tissue sample sections may be used as template for cDNA synthesis.

cDNA Synthesis on Chip

A 16 well mask and Chip Clip slide holder from Whatman were attached to a CodeLink slide. The SuperScript™ III One-step RT-PCR System with Platinum® Taq DNA Polymerase from Invitrogen was used when performing the cDNA synthesis. For each reaction 25 μl 2× reaction mix (SuperScript™ III One-step RT-PCR System with Platinum® Taq DNA Polymerase, Invitrogen), 22.5 μl H₂O and 0.5 μl 100×BSA were mixed and heated to 50° C. SuperScript III/Platinum Taq enzyme mix was added to the reaction mix, 2 μl per reaction, and 50 μl of the reaction mix was added to each well on the chip. The chip was incubated at 50° C. for 30 min (Thermomixer Comfort, Eppendorf).

The reaction mix was removed from the wells and the slide was washed with: 2×SSC, 0.1% SDS at 50° C. for 10 min; 0.2×SSC at room temperature for 1 min; and 0.1×SSC at room temperature for 1 min. The chip was then spin dried.

In the case of FFPE tissue sections, the sections could now be stained and visualized before removal of the tissue, see below section on visualization.

Visualization

Hybridization of Fluorescent Marker Probes Prior to Staining

Prior to tissue application fluorescent marker probes were hybridized to features comprising marker oligonucleotides printed on the capture probe array. The fluorescent marker probes aid in the orientation of the resulting image after tissue visualization, making it possible to combine the image with the resulting expression profiles for individual capture probe “tag” (positional domain) sequences obtained after sequencing. To hybridize fluorescent probes a hybridization solution containing 4×SSC and 0.1% SDS, 2 μM detection probe (P) was incubated for 4 min at 50° C. Meanwhile the in-house array was attached to a ChipClip (Whatman). The array was subsequently incubated at 50° C. for 30 min at 300 rpm shake with 50 μL of hybridization solution per well.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry.

General Histological Staining of FFPE Tissue Sections Prior to or Post cDNA Synthesis

FFPE tissue sections immobilized on capture probe arrays were washed and rehydrated after deparaffinization prior to cDNA synthesis as described previously, or washed after cDNA synthesis as described previously. They were then treated as follows: incubate for 3 minutes in Hematoxylin; rinse with deionized water; incubate 5 minutes in tap water; rapidly dip 8 to 12 times in acid ethanol; rinse 2×1 minute in tap water; rinse 2 minutes in deionized water; incubate 30 seconds in Eosin; wash 3×5 minutes in 95% ethanol; wash 3×5 minutes in 100% ethanol; wash 3×10 minutes in xylene (can be done overnight); place coverslip on slides using DPX; dry slides in the hood overnight.

General Immunohistochemistry Staining of a Target Protein in FFPE Tissue Sections Prior to or Post cDNA Synthesis

FFPE tissue sections immobilized on capture probe arrays were washed and rehydrated after deparaffinization prior to cDNA synthesis as described previously, or washed after cDNA synthesis as described previously. They were then treated as follows without being allowed to dry during the whole staining process: incubate sections with primary antibody (dilute primary antibody in blocking solution comprising 1×Tris Buffered Saline (50 mM Tris, 150 mM NaCl, pH 7.6), 4% donkey serum and 0.1% Triton-X) in a wet chamber overnight at RT; rinse three times with 1×TBS; incubate section with matching secondary antibody conjugated to a fluorochrome (FITC, Cy3 or Cy5) in a wet chamber at RT for 1 hour; rinse 3× with 1×TBS; remove as much as possible of the TBS and mount section with ProLong Gold +DAPI (Invitrogen); and analyze with fluorescence microscope and matching filter sets.

Removal of Residual Tissue

Frozen Tissue

For fresh frozen mouse brain tissue the washing step directly following cDNA synthesis was enough to remove the tissue completely.

FFPE Tissue

The slides with attached formalin fixed paraffinized mouse brain tissue sections were attached to ChipClip slide holders and 16 well masks (Whatman). For each 150 μl Proteinase K Digest Buffer from the RNeasy FFPE kit (Qiagen), 10 μl Proteinase K Solution (Qiagen) was added. 50 μl of the final mixture was added to each well and the slide was incubated at 56° C. for 30 min.

Capture Probe (cDNA) Release

Capture Probe Release with Uracil Cleaving USER Enzyme Mixture in PCR Buffer (Covalently Attached Probes)

A 16 well mask and CodeLink slide were attached to the ChipClip holder (Whatman). 50 μl of a mixture containing 1× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ (Roche), 200 μM dNTPs (New England Biolabs) and 0.1 U/μl USER Enzyme (New England Biolabs) was heated to 37° C. and was added to each well and incubated at 37° C. for 30 min with mixing (3 seconds at 300 rpm, 6 seconds at rest) (Thermomixer comfort; Eppendorf). The reaction mixture containing the released cDNA and probes was then recovered from the wells with a pipette.

Capture Probe Release with Uracil Cleaving USER Enzyme Mixture in TdT (Terminal Transferase) Buffer (Covalently Attached Probes)

50 μl of a mixture containing: 1× TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM Potassium Acetate and 10 mM Magnesium Acetate) (New England Biolabs, www.neb.com); 0.1 μg/μl BSA (New England Biolabs); and 0.1 U/μl USER Enzyme (New England Biolabs) was heated to 37° C. and was added to each well and incubated at 37° C. for 30 min with mixing (3 seconds at 300 rpm, 6 seconds at rest) (Thermomixer comfort; Eppendorf). The reaction mixture containing the released cDNA and probes was then recovered from the wells with a pipette.

Capture Probe Release with Boiling Hot Water (Covalently Attached Probes)

A 16 well mask and CodeLink slide were attached to the ChipClip holder (Whatman). 50 μl of 99° C. water was pipetted into each well. The 99° C. water was allowed to react for 30 minutes. The reaction mixture containing the released cDNA and probes was then recovered from the wells with a pipette.

Capture Probe Release with Heated PCR Buffer (Hybridized In Situ Synthesized Capture Probes, i.e. Capture Probes Hybridized to Surface Probes)

50 μl of a mixture containing: 1× TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM Potassium Acetate and 10 mM Magnesium Acetate) (New England Biolabs, www.neb.com); 0.1 μg/μl BSA (New England Biolabs); and 0.1 U/μl USER Enzyme (New England Biolabs) was preheated to 95° C. The mixture was then added to each well and incubated for 5 minutes at 95° C. with mixing (3 seconds at 300 rpm, 6 seconds at rest) (Thermomixer comfort; Eppendorf). The reaction mixture containing the released probes was then recovered from the wells.

Capture Probe Release with Heated TdT (Terminal Transferase) Buffer (Hybridized In Situ Synthesized Capture Probes, i.e. Capture Probes Hybridized to Surface Probes)

50 μl of a mixture containing: 1× TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM Potassium Acetate and 10 mM Magnesium Acetate) (New England Biolabs, www.neb.com); 0.1 μg/μl BSA (New England Biolabs); and 0.1 U/μl USER Enzyme (New England Biolabs) was preheated to 95° C. The mixture was then added to each well and incubated for 5 minutes at 95° C. with mixing (3 seconds at 300 rpm, 6 seconds at rest) (Thermomixer comfort; Eppendorf). The reaction mixture containing the released probes was then recovered from the wells.

The efficacy of treating the array with the USER enzyme and water heated to 99° C. can be seen in FIG. 4. Enzymatic cleavage using the USER enzyme and the RsaI enzyme was performed using the “in-house” arrays described above (FIG. 4). Hot water mediated release of DNA surface probes was performed using commercial arrays manufactured by Agilent (see FIG. 5).

Probe Collection and Linker Introduction

The experiments demonstrate that first strand cDNA released from the array surface may be modified to produce double stranded DNA and subsequently amplified.

Whole Transcriptome Amplification by the Picoplex Whole Genome Amplification Kit (Capture Probe Sequences Including Positional Domain (Tag) Sequences not Retained at the End of the Resulting dsDNA)

Capture probes were released with uracil cleaving USER enzyme mixture in PCR buffer (covalently attached capture probes) or with heated PCR buffer (hybridized in situ synthesized capture probes, i.e. capture probes hybridized to surface probes).

The released cDNA was amplified using the Picoplex (Rubicon Genomics) random primer whole genome amplification method, which was carried out according to the manufacturer's instructions.

Whole Transcriptome Amplification by dA Tailing with Terminal Transferase (TdT) (Capture Probe Sequences Including Positional Domain (Tag) Sequences Retained at the End of the Resulting dsDNA)

Capture probes were released with uracil cleaving USER enzyme mixture in TdT (terminal transferase) buffer (covalently attached capture probes) or with heated TdT (terminal transferase) buffer (hybridized in situ synthesized capture probes, i.e. capture probes hybridized to surface probes).

38 μl of cleavage mixture was placed in a clean 0.2 ml PCR tube. The mixture contained: 1× TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM Potassium Acetate and 10 mM Magnesium Acetate) (New England Biolabs, www.neb.com), 0.1 μg/μl BSA (New England Biolabs); 0.1 U/μl USER Enzyme (New England Biolabs) (not for heated release); released cDNA (extended from surface probes); and released surface probes. To the PCR tube, 0.5 μl RNase H (5 U/μl, final concentration of 0.06 U/μl), 1 μl TdT (20 U/μl, final concentration of 0.5 U/μl), and 0.5 μl dATP (100 mM, final concentration of 1.25 mM) were added. For dA tailing, the tube was incubated in a thermocycler (Applied Biosystems) at 37° C. for 15 min followed by an inactivation of TdT at 70° C. for 10 min. After dA tailing, a PCR master mix was prepared. The mix contained: 1× Faststart HiFi PCR Buffer (pH 8.3) with 1.8 mM MgCl₂ (Roche); 0.2 mM of each dNTP (Fermentas); 0.2 μM of each primer, A (complementary to the amplification domain of the capture probe) and B_(dT)24 (Eurofins MWG Operon) (complementary to the poly-A tail to be added to the 3′ end of the first cDNA strand); and 0.1 U/μl Faststart HiFi DNA polymerase (Roche). 23 μl of PCR Master mix was placed into nine clean 0.2 ml PCR tubes. 2 μl of dA tailing mixture was added to eight of the tubes, while 2 μl water (RNase/DNase free) was added to the last tube (negative control). PCR amplification was carried out with the following program: Hot start at 95° C. for 2 minutes, second strand synthesis at 50° C. for 2 minutes and 72° C. for 3 minutes, amplification with 30 PCR cycles at 95° C. for 30 seconds, 65° C. for 1 minute, 72° C. for 3 minutes, and a final extension at 72° C. for 10 minutes.
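By way of illustration only, the final concentrations quoted for the dA tailing reaction follow from the stated stock concentrations and volumes, as the following Python sketch shows. The function and variable names are hypothetical; the volumes and stock concentrations are those given above.

```python
def final_concentration(stock, added_ul, total_ul):
    """Concentration after diluting `added_ul` of stock into `total_ul`."""
    return stock * added_ul / total_ul


cleavage_mixture_ul = 38.0

# Reagent: (stock concentration, volume added in ul); units as in the text.
additions = {
    "RNase H (U/ul)": (5.0, 0.5),
    "TdT (U/ul)": (20.0, 1.0),
    "dATP (mM)": (100.0, 0.5),
}

# Total reaction volume is the cleavage mixture plus all additions.
total_ul = cleavage_mixture_ul + sum(vol for _, vol in additions.values())

finals = {
    name: final_concentration(stock, vol, total_ul)
    for name, (stock, vol) in additions.items()
}

print(f"total volume: {total_ul} ul")
for name, value in finals.items():
    print(f"{name}: {value}")
```

The total reaction volume is 40 μl, and the computed RNase H concentration of 0.0625 U/μl corresponds to the rounded value of 0.06 U/μl given above.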

Post-Reaction Clean and Analysis

Four amplification products were pooled together and were processed through a Qiaquick PCR purification column (Qiagen) and eluted into 30 μl EB (10 mM Tris-Cl, pH 8.5). The product was analyzed on a Bioanalyzer (Agilent). A DNA 1000 kit was used according to the manufacturer's instructions.

Sequencing

Illumina Sequencing

A dsDNA library for Illumina sequencing using sample indexing was prepared according to the manufacturer's instructions. Sequencing was carried out on a HiSeq2000 platform (Illumina).

Bioinformatics

Obtaining Digital Transcriptomic Information from Sequencing Data from Whole Transcriptome Libraries Amplified Using the dA Tailing Terminal Transferase Approach

The sequencing data was sorted through the FastX toolkit FASTQ Barcode splitter tool into individual files for the respective capture probe positional domain (tag) sequences. Individually tagged sequencing data was then analyzed through mapping to the mouse genome with the Tophat mapping tool. The resulting SAM file was processed for transcript counts through the HTseq-count software.
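The sorting of reads by positional domain (tag) can be pictured with the following minimal sketch. It is not the FastX toolkit itself: it assumes, for illustration only, that the tag sits at the very start of each read and is matched exactly, and the tag sequences, read names and function name are hypothetical.

```python
from collections import defaultdict

def split_by_tag(reads, tags, tag_length):
    """Group (name, sequence) reads by the tag found at the start of each read.

    Reads whose leading bases match no known tag go to the "unmatched" bin.
    The tag is trimmed from the sequence that is stored.
    """
    known = set(tags)
    bins = defaultdict(list)
    for name, seq in reads:
        tag = seq[:tag_length]
        key = tag if tag in known else "unmatched"
        bins[key].append((name, seq[tag_length:]))
    return dict(bins)

# Hypothetical 6-base tags and reads, for illustration only.
tags = ["ACGTAC", "TTGCAA"]
reads = [
    ("r1", "ACGTACGGGATTC"),
    ("r2", "TTGCAACCCTTAG"),
    ("r3", "NNNNNNAAAA"),
]

bins = split_by_tag(reads, tags, tag_length=6)
for tag, members in bins.items():
    print(tag, members)
```

Each resulting bin corresponds to one array feature, so the reads in a bin can then be mapped and counted separately.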

Obtaining Digital Transcriptomic Information from Sequencing Data from Whole Transcriptome Libraries Amplified Using the Picoplex Whole Genome Amplification Kit Approach

The sequencing data was converted from FASTQ format to FASTA format using the FastX toolkit FASTQ-to-FASTA converter. The sequencing reads were aligned to the capture probe positional domain (tag) sequences using Blastn, and the reads with hits better than 1e-6 to one of the tag sequences were sorted out to individual files for each tag sequence respectively. The file of tag sequence reads was then aligned using Blastn to the mouse transcriptome, and hits were collected.
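The tag assignment step can be sketched as a filter over Blastn hits. The sketch assumes, as labeled assumptions not stated in the text, that the threshold of 1e-6 refers to the E-value and that Blastn was run with 12-column tabular output, in which the E-value is the eleventh column. The read and tag identifiers are hypothetical.

```python
from collections import defaultdict

def assign_reads_to_tags(blast_lines, max_evalue=1e-6):
    """Assign each read to the tag of its best hit with E-value below `max_evalue`.

    Expects 12-column tabular lines: query id, subject id, ..., E-value (11th), bit score.
    """
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        read_id, tag_id, evalue = fields[0], fields[1], float(fields[10])
        if evalue < max_evalue and (read_id not in best or evalue < best[read_id][1]):
            best[read_id] = (tag_id, evalue)
    per_tag = defaultdict(list)
    for read_id, (tag_id, _) in best.items():
        per_tag[tag_id].append(read_id)
    return dict(per_tag)

# Hypothetical hits: read1 matches two tags, read2 only has a weak hit.
example_hits = [
    "read1\ttag_01\t100.0\t18\t0\t0\t1\t18\t1\t18\t2e-08\t36.2",
    "read1\ttag_02\t94.4\t18\t1\t0\t1\t18\t1\t18\t5e-07\t30.1",
    "read2\ttag_02\t88.9\t18\t2\t0\t1\t18\t1\t18\t0.003\t22.0",
]

assignments = assign_reads_to_tags(example_hits)
print(assignments)
```

Keeping only the best hit per read is a design choice of this sketch; the text itself only states that reads with sufficiently good hits were sorted into one file per tag.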

Combining Visualization Data and Expression Profiles

The expression profiles for individual capture probe positional domain (tag) sequences are combined with the spatial information obtained from the tissue sections through staining. Thereby the transcriptomic data from the cellular compartments of the tissue section can be analyzed in a directly comparative fashion, with the ability to distinguish distinct expression features for different cellular subtypes in a given structural context.
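One way to picture this combination is as a join between three lookups: transcript counts per tag, the known array position of each tag, and the tissue annotation read from the stained image at that position. The following is a minimal sketch with hypothetical tags, coordinates, gene names and region labels; it is not taken from the experiments described.

```python
def combine(expression_by_tag, position_by_tag, region_by_position):
    """Attach an array position and an image-derived region label to each tag's counts."""
    records = []
    for tag, counts in expression_by_tag.items():
        xy = position_by_tag[tag]
        records.append({
            "tag": tag,
            "xy": xy,
            # Positions with no annotation in the image are marked explicitly.
            "region": region_by_position.get(xy, "unannotated"),
            "counts": counts,
        })
    return records

# Hypothetical inputs, for illustration only.
expression_by_tag = {"ID-1": {"geneA": 12, "geneB": 0}, "ID-5": {"geneA": 3, "geneB": 7}}
position_by_tag = {"ID-1": (0, 0), "ID-5": (0, 1)}
region_by_position = {(0, 0): "layer 1"}

records = combine(expression_by_tag, position_by_tag, region_by_position)
for record in records:
    print(record)
```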

EXAMPLE 2

FIGS. 8 to 12 show successful visualisation of stained FFPE mouse brain tissue (olfactory bulb) sections on top of a bar-coded transcriptome capture array, according to the general procedure described in Example 1. As compared with the experiment with fresh frozen tissue in Example 1, FIG. 8 shows better morphology with the FFPE tissue. FIGS. 9 and 10 show how tissue may be positioned on different types of probe density arrays.

EXAMPLE 3

Whole Transcriptome Amplification by Random Primer Second Strand Synthesis Followed by Universal Handle Amplification (Capture Probe Sequences Including Tag Sequences Retained at the End of the Resulting dsDNA)

Following capture probe release with uracil cleaving USER enzyme mixture in PCR buffer (covalently attached probes)

OR

Following capture probe release with heated PCR buffer (hybridized in situ synthesized capture probes)

1 μl RNase H (5 U/μl) was added to each of two tubes, final concentration of 0.12 U/μl, containing 40 μl 1× Faststart HiFi PCR Buffer (pH 8.3) with 1.8 mM MgCl₂ (Roche, www.roche-applied-science.com), 0.2 mM of each dNTP (Fermentas, www.fermentas.com), 0.1 μg/μl BSA (New England Biolabs, www.neb.com), 0.1 U/μl USER Enzyme (New England Biolabs), released cDNA (extended from surface probes) and released surface probes. The tubes were incubated at 37° C. for 30 min followed by 70° C. for 20 min in a thermo cycler (Applied Biosystems, www.appliedbiosystems.com). 1 μl Klenow Fragment (3′ to 5′ exo minus) (Illumina, www.illumina.com) and 1 μl handle coupled random primer (10 μM) (Eurofins MWG Operon, www.eurofinsdna.com) were added to the two tubes (B_R8 (octamer) to one of the tubes and B_R6 (hexamer) to the other tube), final concentration of 0.23 μM. The two tubes were incubated at 15° C. for 15 min, 25° C. for 15 min, 37° C. for 15 min and finally 75° C. for 20 min in a thermo cycler (Applied Biosystems). After the incubation, 1 μl of each primer, A_P and B (10 μM) (Eurofins MWG Operon), was added to both tubes, final concentration of 0.22 μM each. 1 μl Faststart HiFi DNA polymerase (5 U/μl) (Roche) was also added to both tubes, final concentration of 0.11 U/μl. PCR amplification was carried out in a thermo cycler (Applied Biosystems) with the following program: Hot start at 94° C. for 2 min, followed by 50 cycles at 94° C. for 15 seconds, 55° C. for 30 seconds, 68° C. for 1 minute, and a final extension at 68° C. for 5 minutes. After the amplification, 40 μl from each of the two tubes was purified with Qiaquick PCR purification columns (Qiagen, www.qiagen.com) and eluted into 30 μl EB (10 mM Tris-Cl, pH 8.5). The purified products were analyzed with a Bioanalyzer (Agilent, www.home.agilent.com) using a DNA 7500 kit. The results are shown in FIGS. 13 and 14.

This Example demonstrates the use of random hexamer and random octamer second strand synthesis, followed by amplification to generate the population from the released cDNA molecules.

EXAMPLE 4

Amplification of ID-Specific and Gene Specific Products after cDNA Synthesis and Probe Collection

Following capture probe release with uracil cleaving USER enzyme mixture in PCR buffer (covalently attached probes).

The cleaved cDNA was amplified in final reaction volumes of 10 μl. Each reaction combined 7 μl cleaved template, 1 μl ID-specific forward primer (2 μM), 1 μl gene-specific reverse primer (2 μM) and 1 μl FastStart High Fidelity Enzyme Blend in 1.4× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ to give a final reaction of 10 μl with 1× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ and 1 U FastStart High Fidelity Enzyme Blend. PCR amplification was carried out in a thermo cycler (Applied Biosystems) with the following program: Hot start at 94° C. for 2 min, followed by 50 cycles at 94° C. for 15 seconds, 55° C. for 30 seconds, 68° C. for 1 minute, and a final extension at 68° C. for 5 minutes.
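The cycling program stated above can be written down as a small data structure, which also gives the total programmed hold time. This is an illustrative sketch only: the dictionary layout and function name are hypothetical, and ramping time between temperatures, which depends on the instrument, is ignored.

```python
# The program as stated in the text: temperatures in degrees C, durations in seconds.
pcr_program = {
    "hot_start": (94, 2 * 60),
    "cycles": 50,
    "cycle_steps": [(94, 15), (55, 30), (68, 60)],
    "final_extension": (68, 5 * 60),
}

def programmed_hold_seconds(program):
    """Sum of all programmed hold times, excluding temperature ramping."""
    per_cycle = sum(duration for _, duration in program["cycle_steps"])
    return (
        program["hot_start"][1]
        + program["cycles"] * per_cycle
        + program["final_extension"][1]
    )

total_seconds = programmed_hold_seconds(pcr_program)
print(f"{total_seconds} s = {total_seconds / 60:.1f} min")
```

With the stated values the programmed hold time is 5670 seconds, or 94.5 minutes, before ramping is taken into account.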

Primer sequences, resulting in a product of approximately 250 bp:

Beta-2 microglobulin (B2M) primer (SEQ ID NO: 43) 5′-TGGGGGTGAGAATTGCTAAG-3′
ID-1 primer (SEQ ID NO: 44) 5′-CCTTCTCCTTCTCCTTCACC-3′
ID-5 primer (SEQ ID NO: 45) 5′-GTCCTCTATTCCGTCACCAT-3′
ID-20 primer (SEQ ID NO: 46) 5′-CTGCTTCTTCCTGGAACTCA-3′

The results are shown in FIG. 15. This shows successful amplification of ID-specific and gene-specific products using two different ID primers (i.e. specific for ID tags positioned at different locations on the microarray) and the same gene specific primer from a brain tissue covering all the probes. Accordingly this experiment establishes that products may be identified by an ID tag-specific or target nucleic acid specific amplification reaction. It is further established that different ID tags may be distinguished. A second experiment, with tissue covering only half of the ID probes (i.e. capture probes) on the array, gave a positive result (PCR product) for spots that were covered with tissue.

EXAMPLE 5

Spatial Genomics

Background.

The method has as its purpose to capture DNA molecules from a tissue sample with retained spatial resolution, making it possible to determine from what part of the tissue a particular DNA fragment stems.

Method.

The principle of the method is to use microarrays with immobilized DNA oligos (capture probes) carrying spatial labeling tag sequences (positional domains). Each feature of oligos of the microarray carries 1) a unique labeling tag (positional domain) and 2) a capture sequence (capture domain). Keeping track of where each labeling tag is geographically placed on the array surface makes it possible to extract positional information in two dimensions from each labeling tag. Fragmented genomic DNA is added to the microarray, for instance through the addition of a thin section of FFPE treated tissue. The genomic DNA in this tissue section is pre-fragmented due to the fixation treatment.

Once the tissue slice has been placed on the array, a universal tailing reaction is carried out through the use of a terminal transferase enzyme. The tailing reaction adds polydA tails to the protruding 3′ ends of the genomic DNA fragments in the tissue. The oligos on the surface are blocked from tailing by terminal transferase through a hybridized and 3′ blocked polydA probe.

Following the terminal transferase tailing, the genomic DNA fragments are able to hybridize to the spatially tagged oligos in their vicinity through the polydA tail meeting the polydT capture sequence on the surface oligos. After hybridization is completed a strand displacing polymerase such as Klenow exo- can use the oligo on the surface as a primer for creation of a new DNA strand complementary to the hybridized genomic DNA fragment. The new DNA strand will now also contain the positional information of the surface oligo's labeling tag.
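The relationship between the surface oligo, the tailed fragment and the newly synthesized strand can be illustrated at the sequence level. The sketch below is a simplification under stated assumptions: the surface oligo is modeled as 5′-tag-polydT-3′, the polydA tail is assumed to pair with the polydT in register so that extension starts at the last genomic base, and the tag and fragment sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def labeled_strand(tag, poly_t_length, fragment):
    """Sequence (5' to 3') of the strand made by extending the surface oligo.

    The oligo contributes the tag and polydT capture sequence; the polymerase
    then copies the hybridized genomic fragment, adding its reverse complement.
    """
    return tag + "T" * poly_t_length + reverse_complement(fragment)

# Hypothetical tag and genomic fragment (the fragment is shown without its dA tail).
tag = "CCTTCT"
fragment = "ACCGTTGA"

strand = labeled_strand(tag, poly_t_length=4, fragment=fragment)
print(strand)

# The positional tag can be read back from the 5' end of the labeled strand.
recovered_tag = strand[:len(tag)]
```

Because the tag is part of the primer, every copied fragment carries the positional information of the feature at which it was captured.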

As a last step the newly generated labeled DNA strands are cleaved from the surface through either enzymatic means, denaturation or physical means. The strands are then collected and can be subjected to downstream amplification of the entire set of strands through introduction of universal handles, amplification of specific amplicons, and/or sequencing.

FIG. 16 is a schematic illustration of this process.

Materials and Methods

Preparation of In-House Printed Microarray with 5′ to 3′ Oriented Probes

20 DNA-capture oligos with individual tag sequences (Table 1) were spotted on glass slides to function as capture probes. The probes were synthesized with a 5′-terminus amino linker with a C6 spacer. All probes were synthesized by Sigma-Aldrich (St. Louis, Mo., USA). The DNA-capture probes were suspended at a concentration of 20 μM in 150 mM sodium phosphate, pH 8.5 and were spotted using a Nanoplotter NP2.1/E (Gesim, Grosserkmannsdorf, Germany) onto CodeLink™ Activated microarray slides (7.5 cm×2.5 cm; Surmodics, Eden Prairie, Minn., USA). After printing, surface blocking was performed according to the manufacturer's instructions. The probes were printed in 16 identical arrays on the slide, and each array contained a pre-defined printing pattern. The 16 sub-arrays were separated during hybridization by a 16-pad mask (ChipClip™ Schleicher & Schuell BioScience, Keene, N.H., USA).

Preparation of In-House Printed Microarray with 3′ to 5′ Oriented Probes and Synthesis of 5′ to 3′ Oriented Capture Probes

Printing of oligos was performed as in the case with 5′ to 3′ oriented probes above.

To hybridize primers for capture probe synthesis, a hybridization solution containing 4×SSC and 0.1% SDS, 2 μM extension primer (A_primer) and 2 μM thread joining primer (p_poly_dT) was incubated for 4 min at 50° C. Meanwhile the in-house array was attached to a ChipClip (Whatman). The array was subsequently incubated at 50° C. for 30 min at 300 rpm shake with 50 μL of hybridization solution per well.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry and placed back in the ChipClip.

For extension and ligation 50 μL of enzyme mix containing 10× Ampligase buffer, 2.5 U AmpliTaq DNA Polymerase Stoffel Fragment (Applied Biosystems), 10 U Ampligase (Epicentre Biotechnologies), dNTPs 2 mM each (Fermentas) and water, is pipetted to each well. The array is subsequently incubated at 55° C. for 30 min. After incubation the array is washed according to the previously described array washing method, but the first step has a duration of 10 min instead of 6 min.

Hybridization of polydA Probe for Protection of Surface Oligo Capture Sequences from dA Tailing

To hybridize a 3′-biotin blocked polydA probe for protection of the surface oligo capture sequences a hybridization solution containing 4×SSC and 0.1% SDS, 2 μM 3′bio-polydA was incubated for 4 min at 50° C. Meanwhile the in-house array was attached to a ChipClip (Whatman). The array was subsequently incubated at 50° C. for 30 min at 300 rpm shake with 50 μL of hybridization solution per well.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry and placed back in the ChipClip.

Preparation of Formalin-Fixed Paraffin-Embedded (FFPE) Tissue

Mouse brain tissue was fixed in 4% formalin at 4° C. for 24 h. After that it was incubated as follows: 3× incubation in 70% ethanol for 1 hour, 1× incubation in 80% ethanol for 1 hour, 1× incubation in 96% ethanol for 1 hour, 3× incubation in 100% ethanol for 1 hour, 2× incubation in xylene at room temperature for 1 h.

The dehydrated samples were then incubated in liquid low melting paraffin at 52-54° C. for up to 3 hours, during which the paraffin was changed once to wash out residual xylene. Finished tissue blocks were then stored at RT. Sections were then cut at 4 μm in paraffin with a microtome onto each capture probe array to be used.

The sections are dried at 37° C. on the array slides for 24 hours and stored at RT.

Deparaffinization of FFPE Tissue

Formalin fixed paraffinized mouse brain 10 μm sections attached to CodeLink slides were deparaffinised in xylene twice for 10 min, 99.5% ethanol for 2 min, 96% ethanol for 2 min, 70% ethanol for 2 min and were then air dried.

Universal Tailing of Genomic DNA

For dA tailing a 50 μl reaction mixture containing 1× TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM Potassium Acetate and 10 mM Magnesium Acetate) (New England Biolabs, www.neb.com), 0.1 μg/μl BSA (New England Biolabs), 1 μl TdT (20 U/μl) and 0.5 μl dATPs (100 mM) was prepared. The mixture was added to the array surface and the array was incubated in a thermo cycler (Applied Biosystems) at 37° C. for 15 min followed by an inactivation of TdT at 70° C. for 10 min. After this the temperature was lowered to 50° C. again to allow for hybridization of dA tailed genomic fragments to the surface oligo capture sequences.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry.

Extension of Labeled DNA

A 50 μl reaction mixture containing 1× Klenow buffer, 200 μM dNTPs (New England Biolabs) and 1 μl Klenow Fragment (3′ to 5′ exo minus) was heated to 37° C., added to each well and incubated at 37° C. for 30 min with mixing (3 s. 300 rpm, 6 s. rest) (Thermomixer comfort; Eppendorf).

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry.

Removal of Residual Tissue

The slides with attached formalin fixed paraffinized mouse brain tissue sections were attached to ChipClip slide holders and 16 well masks (Whatman). For each 150 μl Proteinase K Digest Buffer from the RNeasy FFPE kit (Qiagen) 10 μl Proteinase K Solution (Qiagen) was added. 50 μl of the final mixture was added to each well and the slide was incubated at 56° C. for 30 min.

Capture Probe Release with Uracil Cleaving USER Enzyme Mixture in PCR Buffer (Covalently Attached Probes)

A 16 well mask and CodeLink slide were attached to the ChipClip holder (Whatman). 50 μl of a mixture containing 1× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ (Roche), 200 μM dNTPs (New England Biolabs) and 0.1 U/μl USER Enzyme (New England Biolabs) was heated to 37° C. and was added to each well and incubated at 37° C. for 30 min with mixing (3 s. 300 rpm, 6 s. rest) (Thermomixer comfort; Eppendorf). The reaction mixture containing the released cDNA and probes was then recovered from the wells with a pipette.

Amplification of ID-Specific and Gene Specific Products after Synthesis of Labelled DNA and Probe Collection

Following capture probe release with uracil cleaving USER enzyme mixture in PCR buffer (covalently attached probes).

The cleaved DNA was amplified in final reaction volumes of 10 μl. Each reaction combined 7 μl cleaved template, 1 μl ID-specific forward primer (2 μM), 1 μl gene-specific reverse primer (2 μM) and 1 μl FastStart High Fidelity Enzyme Blend in 1.4× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ to give a final reaction of 10 μl with 1× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ and 1 U FastStart High Fidelity Enzyme Blend. PCR amplification was carried out in a thermo cycler (Applied Biosystems) with the following program: Hot start at 94° C. for 2 min, followed by 50 cycles at 94° C. for 15 seconds, 55° C. for 30 seconds, 68° C. for 1 minute, and a final extension at 68° C. for 5 minutes.

Whole Genome Amplification by Random Primer Second Strand Synthesis Followed by Universal Handle Amplification (Capture Probe Sequences Including Tag Sequences Retained at the End of the Resulting dsDNA)

Following capture probe release with uracil cleaving USER enzyme mixture in PCR buffer (covalently attached probes).

A reaction mixture was prepared containing 40 μl 1× Faststart HiFi PCR Buffer (pH 8.3) with 1.8 mM MgCl₂ (Roche, www.roche-applied-science.com), 0.2 mM of each dNTP (Fermentas, www.fermentas.com), 0.1 μg/μl BSA (New England Biolabs, www.neb.com), 0.1 U/μl USER Enzyme (New England Biolabs), released DNA (extended from surface probes) and released surface probes. The tube was incubated at 37° C. for 30 min followed by 70° C. for 20 min in a thermo cycler (Applied Biosystems, www.appliedbiosystems.com). 1 μl Klenow Fragment (3′ to 5′ exo minus) (Illumina, www.illumina.com) and 1 μl handle coupled random primer (10 μM) (Eurofins MWG Operon, www.eurofinsdna.com) were added to the tube. The tube was incubated at 15° C. for 15 min, 25° C. for 15 min, 37° C. for 15 min and finally 75° C. for 20 min in a thermo cycler (Applied Biosystems). After the incubation, 1 μl of each primer, A_P and B (10 μM) (Eurofins MWG Operon), was added to the tube. 1 μl Faststart HiFi DNA polymerase (5 U/μl) (Roche) was also added to the tube. PCR amplification was carried out in a thermo cycler (Applied Biosystems) with the following program: Hot start at 94° C. for 2 min, followed by 50 cycles at 94° C. for 15 seconds, 55° C. for 30 seconds, 68° C. for 1 minute, and a final extension at 68° C. for 5 minutes. After the amplification, 40 μl from the tube was purified with Qiaquick PCR purification columns (Qiagen, www.qiagen.com) and eluted into 30 μl EB (10 mM Tris-Cl, pH 8.5). The purified product was analyzed with a Bioanalyzer (Agilent, www.home.agilent.com) using a DNA 7500 kit.

Visualization

Hybridization of Fluorescent Marker Probes Prior to Staining

Prior to tissue application fluorescent marker probes are hybridized to designated marker sequences printed on the capture probe array. The fluorescent marker probes aid in the orientation of the resulting image after tissue visualization, making it possible to combine the image with the resulting expression profiles for individual capture probe tag sequences obtained after sequencing. To hybridize fluorescent probes a hybridization solution containing 4×SSC and 0.1% SDS, 2 μM detection probe (P) was incubated for 4 min at 50° C. Meanwhile the in-house array was attached to a ChipClip (Whatman). The array was subsequently incubated at 50° C. for 30 min at 300 rpm shake with 50 μL of hybridization solution per well.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry.

General Histological Staining of FFPE Tissue Sections Prior to or Post Synthesis of Labeled DNA

FFPE tissue sections immobilized on capture probe arrays are washed and rehydrated after deparaffinization prior to synthesis of labeled DNA as described previously, or washed after synthesis of labeled DNA as described previously. They are then treated as follows: incubate for 3 minutes in Hematoxylin, rinse with deionized water, incubate 5 minutes in tap water, rapidly dip 8 to 12 times in acid ethanol, rinse 2×1 minute in tap water, rinse 2 minutes in deionized water, incubate 30 seconds in Eosin, wash 3×5 minutes in 95% ethanol, wash 3×5 minutes in 100% ethanol, wash 3×10 minutes in xylene (can be done overnight), place coverslip on slides using DPX, dry slides in the hood overnight.

General Immunohistochemistry Staining of a Target Protein in FFPE Tissue Sections Prior to or Post Synthesis of Labeled DNA

FFPE tissue sections immobilized on capture probe arrays are washed and rehydrated after deparaffinization prior to synthesis of labeled DNA as described previously, or washed after synthesis of labeled DNA as described previously. They are then treated as follows without being allowed to dry during the whole staining process: dilute primary antibody in blocking solution (1×TBS (Tris Buffered Saline: 50 mM Tris, 150 mM NaCl, pH 7.6), 4% donkey serum, 0.1% triton-x), incubate sections with primary antibody in a wet chamber overnight at RT, rinse 3× with 1×TBS, incubate section with matching secondary antibody conjugated to a fluorochrome (FITC, Cy3 or Cy5) in a wet chamber at RT for 1 h, rinse 3× with 1×TBS, remove as much as possible of TBS and mount section with ProLong Gold + DAPI (Invitrogen) and analyze with fluorescence microscope and matching filter sets.

EXAMPLE 6

This experiment was conducted following the principles of Example 5, but using fragmented genomic DNA on the array rather than tissue. The genomic DNA was pre-fragmented to a mean size of 200 bp and 700 bp respectively. This experiment shows that the principle works. Fragmented genomic DNA closely resembles the pre-fragmented genomic DNA present in FFPE tissue.

Amplification of Internal Gene Specific Products after Synthesis of Labelled DNA and Probe Collection

Following capture probe release with uracil cleaving USER enzyme mixture in PCR buffer (covalently attached probes) containing 1× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ (Roche), 200 μM dNTPs (New England Biolabs) and 0.1 U/μl USER Enzyme (New England Biolabs).

The cleaved DNA was amplified in a final reaction volume of 50 μl. To 47 μl cleaved template was added 1 μl ID-specific forward primer (10 μM), 1 μl gene-specific reverse primer (10 μM) and 1 μl FastStart High Fidelity Enzyme Blend. PCR amplification was carried out in a thermo cycler (Applied Biosystems) with the following program: Hot start at 94° C. for 2 min, followed by 50 cycles at 94° C. for 15 seconds, 55° C. for 30 seconds, 68° C. for 1 minute, and a final extension at 68° C. for 5 minutes.

Amplification of Label-Specific and Gene Specific Products after Synthesis of Labelled DNA and Probe Collection

Following capture probe release with uracil cleaving USER enzyme mixture in PCR buffer (covalently attached probes) containing 1× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ (Roche), 200 μM dNTPs (New England Biolabs) and 0.1 U/μl USER Enzyme (New England Biolabs).

The cleaved DNA was amplified in a final reaction volume of 50 μl. To 47 μl cleaved template was added 1 μl label-specific forward primer (10 μM), 1 μl gene-specific reverse primer (10 μM) and 1 μl FastStart High Fidelity Enzyme Blend. PCR amplification was carried out in a thermo cycler (Applied Biosystems) with the following program: Hot start at 94° C. for 2 min, followed by 50 cycles at 94° C. for 15 seconds, 55° C. for 30 seconds, 68° C. for 1 minute, and a final extension at 68° C. for 5 minutes.

Forward-Genomic DNA Human Primer (SEQ ID NO: 47) 5′-GACTGCTCTTTTCACCCATC-3′
Reverse-Genomic DNA Human Primer (SEQ ID NO: 48) 5′-GGAGCTGCTGGTGCAGGG-3′
P-label specific primer (SEQ ID NO: 49) 5′-ATCTCGACTGCCACTCTGAA-3′

The results are shown in FIGS. 17 to 20. The Figures show internal products amplified on the array; the detected peaks in FIGS. 17 and 18 are of the expected size. This thus demonstrates that genomic DNA may be captured and amplified. In FIGS. 19 and 20, the expected product is a smear given that the random fragmentation and terminal transferase labeling of genomic DNA will generate a very diverse sample pool.

EXAMPLE 7

Alternative Synthesis of 5′ to 3′ Oriented Capture Probes Using Polymerase Extension and Terminal Transferase Tailing

To hybridize primers for capture probe synthesis, a hybridization solution containing 4×SSC and 0.1% SDS and 2 μM extension primer (A_primer) was incubated for 4 min at 50° C. Meanwhile the in-house array (see Example 1) was attached to a ChipClip (Whatman). The array was subsequently incubated at 50° C. for 30 min at 300 rpm shake with 50 μL of hybridization solution per well.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry and placed back in the ChipClip.

1 μl Klenow Fragment (3′ to 5′ exo minus) (Illumina, www.illumina.com) together with 10× Klenow buffer, dNTPs 2 mM each (Fermentas) and water, was mixed into a 50 μl reaction and was pipetted into each well.

The array was incubated at 15° C. for 15 min, 25° C. for 15 min, 37° C. for 15 min and finally 75° C. for 20 min in an Eppendorf Thermomixer.

After incubation, the array was removed from the ChipClip and washed with the 3 following steps: 1) 50° C. 2×SSC solution with 0.1% SDS for 6 min at 300 rpm shake, 2) 0.2×SSC for 1 min at 300 rpm shake and 3) 0.1×SSC for 1 min at 300 rpm shake. The array was then spun dry and placed back in the ChipClip.

For dT tailing a 50 μl reaction mixture containing 1× TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM Potassium Acetate and 10 mM Magnesium Acetate) (New England Biolabs, www.neb.com), 0.1 μg/μl BSA (New England Biolabs), 0.5 μl RNase H (5 U/μl), 1 μl TdT (20 U/μl) and 0.5 μl dTTPs (100 mM) was prepared. The mixture was added to the array surface and the array was incubated in a thermo cycler (Applied Biosystems) at 37° C. for 15 min followed by an inactivation of TdT at 70° C. for 10 min.

EXAMPLE 8

Spatial Transcriptomics Using 5′ to 3′ High Probe Density Arrays and Formalin-Fixed Frozen (FF-Frozen) Tissue with USER System Cleavage and Amplification Via Terminal Transferase

Array Preparation

Pre-fabricated high-density microarray chips were ordered from Roche-Nimblegen (Madison, Wis., USA). Each capture probe array contained 135,000 features of which 132,640 features carried a capture probe comprising a unique ID-tag sequence (positional domain) and a capture region (capture domain). Each feature was 13×13 μm in size. The capture probes were composed 5′ to 3′ of a universal domain containing five dUTP bases (a cleavage domain) and a general amplification domain, an ID tag (positional domain) and a capture region (capture domain) (FIG. 22 and Table 2). Each array was also fitted with a frame of marker probes (FIG. 23) carrying a generic 30 bp sequence (Table 2) to enable hybridization of fluorescent probes to help with orientation during array visualization.
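The 5′ to 3′ domain order of the capture probes can be modeled as a simple record. The sketch below is illustrative only: the class name and all sequences are hypothetical placeholders (the actual sequences are those of Table 2), and uracil positions are written as "U" purely for readability.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaptureProbe:
    """Capture probe domains listed in 5' to 3' order."""
    cleavage_domain: str       # universal domain part containing the dU bases
    amplification_domain: str  # general amplification domain
    id_tag: str                # positional domain, unique per feature
    capture_domain: str        # capture region at the free 3' end

    def sequence(self):
        """Full probe sequence, 5' to 3'."""
        return (self.cleavage_domain + self.amplification_domain
                + self.id_tag + self.capture_domain)

    def tag_span(self):
        """Start and end index of the ID tag within the full sequence."""
        start = len(self.cleavage_domain) + len(self.amplification_domain)
        return start, start + len(self.id_tag)

# Placeholder sequences, for illustration only.
probe = CaptureProbe(
    cleavage_domain="UUUUU",
    amplification_domain="ACACTCTTTCC",
    id_tag="GTCCTCTATT",
    capture_domain="T" * 20,
)

start, end = probe.tag_span()
full = probe.sequence()
print(full[start:end])

# Array figures taken from the text.
feature_area_um2 = 13 * 13
tagged_features, total_features = 132_640, 135_000
```

Placing the cleavage domain at the 5′ end means that everything 3′ of it, including the amplification domain, the ID tag and any extension product, is released together when the probe is cleaved.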

Tissue Preparation—Preparation of Formalin-Fixed Frozen Tissue

The animal (mouse) was perfused with 50 ml PBS and 100 ml 4% formalin solution. After excision of the olfactory bulb, the tissue was put into a 4% formalin bath for post-fixation for 24 hrs. The tissue was then sucrose treated in 30% sucrose dissolved in PBS for 24 hrs to stabilize morphology and to remove excess formalin. The tissue was frozen at a controlled rate down to −40° C. and kept at −20° C. between experiments. Similar preparation of tissue postfixed for 3 hrs or without post-fixation was carried out for a parallel specimen. Perfusion with 2% formalin without post-fixation was also used successfully. Similarly the sucrose treatment step could be omitted. The tissue was mounted into a cryostat for sectioning at 10 μm. A slice of tissue was applied onto each capture probe array to be used. Optionally for better tissue adherence, the array chip was placed at 50° C. for 15 minutes.

Optional Control—Total RNA Preparation from Sectioned Tissue

Total RNA was extracted from a single tissue section (10 μm) using the RNeasy FFPE kit (Qiagen) according to the manufacturer's instructions. The total RNA obtained from the tissue section was used in control experiments for a comparison with experiments in which the RNA was captured on the array directly from the tissue section. Accordingly, in the case where total RNA was applied to the array the staining, visualization and degradation of tissue steps were omitted.

On-Chip Reactions

The hybridization of marker probe to the frame probes, reverse transcription, nuclear staining, tissue digestion and probe cleavage reactions were all performed in a 16 well silicone gasket (ArrayIt, Sunnyvale, Calif., USA) with a reaction volume of 50 μl per well. To prevent evaporation, the cassettes were covered with plate sealers (In Vitro AB, Stockholm, Sweden).

Optional—Tissue Permeabilization Prior to cDNA Synthesis

For permeabilization using Proteinase K, proteinase K (Qiagen, Hilden, Germany) was diluted to 1 μg/ml in PBS. The solution was added to the wells and the slide incubated at room temperature for 5 minutes, followed by a gradual increase to 80° C. over 10 minutes. The slide was washed briefly in PBS before the reverse transcription reaction.

Alternatively for permeabilization using microwaves, after tissue attachment, the slide was placed at the bottom of a glass jar containing 50 ml 0.2×SSC (Sigma-Aldrich) and was heated in a microwave oven for 1 minute at 800 W. Directly after microwave treatment the slide was placed onto a paper tissue and was dried for 30 minutes in a chamber protected from unnecessary air exposure. After drying, the slide was briefly dipped in water (RNase/DNase free) and finally spin-dried by a centrifuge before cDNA synthesis was initiated.

cDNA Synthesis

For the reverse transcription reaction the SuperScript III One-Step RT-PCR System with Platinum Taq (Life Technologies/Invitrogen, Carlsbad, Calif., USA) was used. Reverse transcription reactions contained 1× reaction mix, 1× BSA (New England Biolabs, Ipswich, Mass., USA) and 2 μl SuperScript III RT/Platinum Taq mix in a final volume of 50 μl. This solution was heated to 50° C. before application to the tissue sections and the reaction was performed at 50° C. for 30 minutes. The reverse transcription solution was subsequently removed from the wells and the slide was allowed to air dry for 2 hours.

Tissue Visualization

After cDNA synthesis, nuclear staining and hybridization of the markerprobe to the frame probes (probes attached to the array substrate toenable orientation of the tissue sample on the array) was donesimultaneously. A solution with DAPI at a concentration of 300 nM andmarker probe at a concentration of 170 nM in PBS was prepared. Thissolution was added to the wells and the slide was incubated at roomtemperature for 5 minutes, followed by brief washing in PBS and spindrying.

Alternatively the marker probe was hybridized to the frame probes priorto placing the tissue on the array. The marker probe was then diluted to170 nM in hybridization buffer (4×SSC, 0.1% SDS). This solution washeated to 50° C. before application to the chip and the hybridizationwas performed at 50° C. for 30 minutes at 300 rpm. After hybridization,the slide was washed in 2×SSC, 0.1% SDS at 50° C. and 300 rpm for 10minutes, 0.2×SSC at 300 rpm for 1 minute and 0.1×SSC at 300 rpm for 1minute. In that case the staining solution after cDNA synthesis onlycontained the nuclear DAPI stain diluted to 300 nM in PBS. The solutionwas applied to the wells and the slide was incubated at room temperaturefor 5 minutes, followed by brief washing in PBS and spin drying.

The sections were microscopically examined with a Zeiss Axio Imager Z2 and processed with MetaSystems software.

Tissue Removal

The tissue sections were digested using Proteinase K diluted to 1.25 μg/μl in PKD buffer from the RNeasy FFPE Kit (both from Qiagen) at 56° C. for 30 minutes with an interval mix at 300 rpm for 3 seconds, then 6 seconds rest. The slide was subsequently washed in 2×SSC, 0.1% SDS at 50° C. and 300 rpm for 10 minutes, 0.2×SSC at 300 rpm for 1 minute and 0.1×SSC at 300 rpm for 1 minute.

Probe Release

The 16-well Hybridization Cassette with silicone gasket (ArrayIT) was preheated to 37° C. and attached to the Nimblegen slide. A volume of 50 μl of cleavage mixture preheated to 37° C., consisting of Lysis buffer at an unknown concentration (Takara), 0.1 U/μl USER Enzyme (NEB) and 0.1 μg/μl BSA, was added to each of the wells containing surface immobilized cDNA. After removal of bubbles the slide was sealed and incubated at 37° C. for 30 minutes in a Thermomixer comfort with cycled shaking at 300 rpm for 3 seconds with 6 seconds rest in between. After the incubation 45 μl cleavage mixture was collected from each of the used wells and placed into 0.2 ml PCR tubes (FIG. 24).

Library Preparation

Exonuclease Treatment

After cooling the solutions on ice for 2 minutes, Exonuclease I (NEB) was added, to remove unextended cDNA probes, to a final volume of 46.2 μl and a final concentration of 0.52 U/μl. The tubes were incubated in a thermocycler (Applied Biosystems) at 37° C. for 30 minutes followed by inactivation of the exonuclease at 80° C. for 25 minutes.

dA-Tailing by Terminal Transferase

After the exonuclease step, 45 μl polyA-tailing mixture, according to the manufacturer's instructions consisting of TdT Buffer (Takara), 3 mM dATP (Takara) and the manufacturer's TdT Enzyme mix (TdT and RNase H) (Takara), was added to each of the samples. The mixtures were incubated in a thermocycler at 37° C. for 15 minutes followed by inactivation of TdT at 70° C. for 10 minutes.

Second-Strand Synthesis and PCR-Amplification

After dA-tailing, 23 μl PCR master mix was placed into four new 0.2 ml PCR tubes per sample, and 2 μl of sample was added to each tube as a template. The final PCRs consisted of 1× Ex Taq buffer (Takara), 200 μM of each dNTP (Takara), 600 nM A_primer (MWG), 600 nM B_dT20VN_primer (MWG) and 0.025 U/μl Ex Taq polymerase (Takara) (Table 2). A second cDNA strand was created by running one cycle in a thermocycler at 95° C. for 3 minutes, 50° C. for 2 minutes and 72° C. for 3 minutes. Then the samples were amplified by running 20 cycles (for library preparation) or 30 cycles (to confirm the presence of cDNA) at 95° C. for 30 seconds, 67° C. for 1 minute and 72° C. for 3 minutes, followed by a final extension at 72° C. for 10 minutes.
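The cycling program described above can be tallied with a short sketch. The Python snippet below is illustrative only: it encodes the hold times stated in the text and sums them, ignoring ramp times between temperatures (an assumption, since ramp rates are not given). The names SECOND_STRAND, AMPLIFICATION_CYCLE, FINAL_EXTENSION, hold_seconds and program_minutes are hypothetical and are not part of the described protocol.

```python
# Each step is (temperature in degrees C, hold time in seconds), as stated in the text.
SECOND_STRAND = [(95, 180), (50, 120), (72, 180)]       # one cycle
AMPLIFICATION_CYCLE = [(95, 30), (67, 60), (72, 180)]   # repeated 20 or 30 times
FINAL_EXTENSION = [(72, 600)]


def hold_seconds(steps):
    """Sum the hold times of a list of (temperature, seconds) steps."""
    return sum(seconds for _, seconds in steps)


def program_minutes(cycles):
    """Total hold time in minutes; ramp times are ignored (assumption)."""
    total = (
        hold_seconds(SECOND_STRAND)
        + cycles * hold_seconds(AMPLIFICATION_CYCLE)
        + hold_seconds(FINAL_EXTENSION)
    )
    return total / 60


print(program_minutes(20))  # library preparation -> 108.0
print(program_minutes(30))  # confirming the presence of cDNA -> 153.0
```

Under these assumptions the 20-cycle program holds for 108 minutes and the 30-cycle program for 153 minutes; actual run times would be longer by the instrument's ramp times.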

Library Cleanup

After amplification, the four PCRs (100 μl) were mixed with 500 μl binding buffer (Qiagen) and placed in a Qiaquick PCR purification column (Qiagen) and spun for 1 minute at 17,900×g in order to bind the amplified cDNA to the membrane. The membrane was then washed with wash buffer (Qiagen) containing ethanol and finally eluted into 50 μl of 10 mM Tris-Cl, pH 8.5.

The purified and concentrated sample was further purified and concentrated by CA-purification (purification by superparamagnetic beads conjugated to carboxylic acid) with an MBS robot (Magnetic Biosolutions). A final PEG concentration of 10% was used in order to remove fragments below 150-200 bp. The amplified cDNA was allowed to bind to the CA-beads (Invitrogen) for 10 min and was then eluted into 15 μl of 10 mM Tris-Cl, pH 8.5.

Library Quality Analysis

Samples amplified for 30 cycles were analyzed with an Agilent Bioanalyzer (Agilent) in order to confirm the presence of an amplified cDNA library; the DNA High Sensitivity kit or DNA 1000 kit was used depending on the amount of material.

Sequencing Library Preparation

Library Indexing

Samples amplified for 20 cycles were used further to prepare sequencing libraries. An index PCR master mix was prepared for each sample and 23 μl was placed into six 0.2 ml tubes. 2 μl of the amplified and purified cDNA was added to each of the six PCRs as template, so that the PCRs contained 1× Phusion master mix (Fermentas), 500 nM InPE1.0 (Illumina), 500 nM Index 1-12 (Illumina), and 0.4 nM InPE2.0 (Illumina). The samples were amplified in a thermocycler for 18 cycles at 98° C. for 30 seconds, 65° C. for 30 seconds and 72° C. for 1 minute, followed by a final extension at 72° C. for 5 minutes.

Sequencing Library Cleanup

After amplification, the six PCRs (150 μl) were mixed with 750 μl binding buffer and placed in a Qiaquick PCR purification column and spun for 1 minute at 17,900×g in order to bind the amplified cDNA to the membrane (because of the large sample volume (900 μl), the sample was split in two (each 450 μl) and was bound in two separate steps). The membrane was then washed with wash buffer containing ethanol and finally eluted into 50 μl of 10 mM Tris-Cl, pH 8.5.

The purified and concentrated sample was further purified and concentrated by CA-purification with an MBS robot. A final PEG concentration of 7.8% was used in order to remove fragments below 300-350 bp. The amplified cDNA was allowed to bind to the CA-beads for 10 min and was then eluted into 15 μl of 10 mM Tris-Cl, pH 8.5. Samples were analyzed with an Agilent Bioanalyzer in order to confirm the presence and size of the finished libraries; the DNA High Sensitivity kit or DNA 1000 kit was used according to the manufacturer's instructions depending on the amount of material (FIG. 25).

Sequencing

The libraries were sequenced on the Illumina HiSeq 2000 or MiSeq, depending on the desired data throughput, according to the manufacturer's instructions. Optionally for read 2, a custom sequencing primer B_r2 was used to avoid sequencing through the homopolymeric stretch of 20 T.

Data Analysis

Read 1 was trimmed by 42 bases at the 5′ end. Read 2 was trimmed by 25 bases at the 5′ end (optionally no bases were trimmed from read 2 if the custom primer was used). The reads were then mapped with bowtie to the repeat masked Mus musculus 9 genome assembly and the output was formatted in the SAM file format. Mapped reads were extracted and annotated with UCSC refGene gene annotations. Indexes were retrieved with ‘indexFinder’ (an in-house software for index retrieval). A MongoDB database was then created containing information about all captured transcripts and their respective index position on the chip.
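The trimming and index retrieval steps can be sketched as follows. This is a minimal Python illustration and not the in-house ‘indexFinder’ software, whose implementation is not described here. It assumes, for illustration only, that the ID-tag occupies the first 18 bases of the untrimmed read 1 (a length inferred from the Example 8 probes in Table 2); the feature coordinates in ID_TO_POSITION are invented, and all function names are hypothetical.

```python
READ1_TRIM = 42  # bases trimmed from the 5' end of read 1 (stated in the text)
READ2_TRIM = 25  # bases trimmed from the 5' end of read 2 (stated in the text)
ID_LENGTH = 18   # assumption: ID-tag length inferred from the Table 2 probes

# Hypothetical lookup from ID-tag to (column, row) on the chip. The ID-tags are
# taken from Probe1 and Probe2 of Example 8 in Table 2; the coordinates are invented.
ID_TO_POSITION = {
    "GTCCGATATGATTGCCGC": (0, 0),
    "ATGAGCCGGGTTCATCTT": (1, 0),
}


def trim_5prime(read, n_bases):
    """Remove n_bases from the 5' end of a read."""
    return read[n_bases:]


def find_position(read1):
    """Return the feature position for the leading ID-tag, or None if unknown.

    Assumes the untrimmed read 1 starts with the ID-tag.
    """
    return ID_TO_POSITION.get(read1[:ID_LENGTH])


def make_record(read1, read2):
    """Build a dictionary record of the kind that could be stored per read pair."""
    return {
        "position": find_position(read1),
        "read1_insert": trim_5prime(read1, READ1_TRIM),
        "read2_insert": trim_5prime(read2, READ2_TRIM),
    }
```

In practice the trimmed reads would be passed to the aligner and the resulting gene annotation stored together with the position; the record layout shown here is only one possible choice.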

A Matlab implementation was connected to the database and allowed for spatial visualization and analysis of the data (FIG. 26).

Optionally, the data visualization was overlaid with the microscopic image using the fluorescently labelled frame probes for exact alignment and enabling spatial transcriptomic data extraction.

EXAMPLE 9

Spatial Transcriptomics Using 3′ to 5′ High Probe Density Arrays and FFPE Tissue with MutY System Cleavage and Amplification Via TdT

Array Preparation

Pre-fabricated high-density microarray chips were ordered from Roche-Nimblegen (Madison, Wis., USA). Each capture probe array used contained 72 k features, of which 66,022 contained a unique ID-tag complementary sequence. Each feature was 16×16 μm in size. The capture probes were composed 3′ to 5′ in the same way as the probes used for the in-house printed 3′ to 5′ arrays, with the exception of 3 additional bases being added to the upper (P′) general handle of the probe to make it a long version of P′, LP′ (Table 2). Each array was also fitted with a frame of probes carrying a generic 30 bp sequence to enable hybridization of fluorescent probes to help with orientation during array visualization.
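The relationship between an ID-tag and its complementary sequence on the 3′ to 5′ array can be checked with a short sketch. The Python snippet below is illustrative only: it takes the ID segment of Example 8 Probe1 and the full Example 9 Probe1 from Table 2 and confirms that the latter contains the reverse complement of the former. The 18-nucleotide boundaries of the ID segment are an assumption read off the probe layout, and the names used are hypothetical.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# ID segment of Example 8 Probe1 (SEQ ID NO: 50) as listed in Table 2.
# Assumption: the ID-tag is the 18 nt between the A-handle and the poly-T stretch.
ID_TAG_5_TO_3 = "GTCCGATATGATTGCCGC"

# Example 9 Probe1 (SEQ ID NO: 80) as listed in Table 2.
ARRAY_PROBE_EXAMPLE_9 = (
    "GCGTTCAGAGTGGCAGTCGAGATCACGCGGCAATCATATCGGACAGATCGGAAGAGCGTAGTGTAG"
)

id_complement = reverse_complement(ID_TAG_5_TO_3)
print(id_complement)                           # GCGGCAATCATATCGGAC
print(id_complement in ARRAY_PROBE_EXAMPLE_9)  # True
```

The same check could be repeated for the other probe pairs in Table 2 to confirm that each Example 9 array probe carries the complement of the corresponding Example 8 ID-tag.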

Synthesis of 5′ to 3′ Oriented Capture Probes

The synthesis of 5′ to 3′ oriented capture probes on the high-density arrays was carried out as in the case with in-house printed arrays, with the exception that the extension and ligation steps were carried out at 55° C. for 15 mins followed by 72° C. for 15 mins. The A-handle probe (Table 2) included an A/G mismatch to allow for subsequent release of probes through the MutY enzymatic system described below. The P-probe was replaced by a longer LP version to match the longer probes on the surface.

Preparation of Formalin-Fixed Paraffin-Embedded Tissue and Deparaffinization

This was carried out as described above in the in-house protocol.

cDNA Synthesis and Staining

cDNA synthesis and staining were carried out as in the protocol for 5′ to 3′ oriented high-density Nimblegen arrays with the exception that biotin labeled dCTPs and dATPs were added to the cDNA synthesis together with the four regular dNTPs (each regular dNTP was present at a 25-fold higher concentration than the biotin labeled ones).

Tissue Removal

Tissue removal was carried out in the same way as in the protocol for 5′ to 3′ oriented high-density Nimblegen arrays described in Example 8.

Probe Cleavage by MutY

A 16-well Incubation chamber with silicone gasket (ArrayIT) was preheated to 37° C. and attached to the Codelink slide. A volume of 50 μl of cleavage mixture preheated to 37° C., consisting of 1× Endonuclease VIII Buffer (NEB), 10 U/μl MutY (Trevigen), 10 U/μl Endonuclease VIII (NEB) and 0.1 μg/μl BSA, was added to each of the wells containing surface immobilized cDNA. After removal of bubbles the slide was sealed and incubated at 37° C. for 30 minutes in a Thermomixer comfort with cycled shaking at 300 rpm for 3 seconds with 6 seconds rest in between. After the incubation, the plate sealer was removed and 40 μl cleavage mixture was collected from each of the used wells and placed into a PCR plate.

Library Preparation

Biotin-Streptavidin Mediated Library Cleanup

To remove unextended cDNA probes and to change buffer, the samples were purified by binding the biotin labeled cDNA to streptavidin coated C1-beads (Invitrogen) and washing the beads with 0.1M NaOH (made fresh). The purification was carried out with an MBS robot (Magnetic Biosolutions); the biotin labelled cDNA was allowed to bind to the C1-beads for 10 min and was then eluted into 20 μl of water by heating the bead-water solution to 80° C. to break the biotin-streptavidin binding.

dA-Tailing by Terminal Transferase

After the purification step, 18 μl of each sample was placed into new 0.2 ml PCR tubes and mixed with 22 μl of a polyA-tailing master mix, leading to a 40 μl reaction mixture which, according to the manufacturer's instructions, consisted of lysis buffer (Takara, Cellamp Whole Transcriptome Amplification kit), TdT Buffer (Takara), 1.5 mM dATP (Takara) and TdT Enzyme mix (TdT and RNase H) (Takara). The mixtures were incubated in a thermocycler at 37° C. for 15 minutes followed by inactivation of TdT at 70° C. for 10 minutes.

Second-Strand Synthesis and PCR-Amplification

After dA-tailing, 23 μl PCR master mix was placed into four new 0.2 ml PCR tubes per sample, and 2 μl of sample was added to each tube as a template. The final PCRs consisted of 1× Ex Taq buffer (Takara), 200 μM of each dNTP (Takara), 600 nM A_primer (MWG), 600 nM B_dT20VN_primer (MWG) and 0.025 U/μl Ex Taq polymerase (Takara). A second cDNA strand was created by running one cycle in a thermocycler at 95° C. for 3 minutes, 50° C. for 2 minutes and 72° C. for 3 minutes. Then the samples were amplified by running 20 cycles (for library preparation) or 30 cycles (to confirm the presence of cDNA) at 95° C. for 30 seconds, 67° C. for 1 minute and 72° C. for 3 minutes, followed by a final extension at 72° C. for 10 minutes.

Library Cleanup

After amplification, the four PCRs (100 μl) were mixed with 500 μl binding buffer (Qiagen) and placed in a Qiaquick PCR purification column (Qiagen) and spun for 1 minute at 17,900×g in order to bind the amplified cDNA to the membrane. The membrane was then washed with wash buffer (Qiagen) containing ethanol and finally eluted into 50 μl of 10 mM Tris-HCl, pH 8.5.

The purified and concentrated sample was further purified and concentrated by CA-purification (purification by superparamagnetic beads conjugated to carboxylic acid) with an MBS robot (Magnetic Biosolutions). A final PEG concentration of 10% was used in order to remove fragments below 150-200 bp. The amplified cDNA was allowed to bind to the CA-beads (Invitrogen) for 10 min and was then eluted into 15 μl of 10 mM Tris-HCl, pH 8.5.

Second PCR-Amplification

The final PCRs consisted of 1× Ex Taq buffer (Takara), 200 μM of each dNTP (Takara), 600 nM A_primer (MWG), 600 nM B_primer (MWG) and 0.025 U/μl Ex Taq polymerase (Takara). The samples were heated to 95° C. for 3 minutes, and then amplified by running 10 cycles at 95° C. for 30 seconds, 65° C. for 1 minute and 72° C. for 3 minutes, followed by a final extension at 72° C. for 10 minutes.

Second Library Cleanup

After amplification, the four PCRs (100 μl) were mixed with 500 μl binding buffer (Qiagen) and placed in a Qiaquick PCR purification column (Qiagen) and spun for 1 minute at 17,900×g in order to bind the amplified cDNA to the membrane. The membrane was then washed with wash buffer (Qiagen) containing ethanol and finally eluted into 50 μl of 10 mM Tris-Cl, pH 8.5.

The purified and concentrated sample was further purified and concentrated by CA-purification (purification by super-paramagnetic beads conjugated to carboxylic acid) with an MBS robot (Magnetic Biosolutions). A final PEG concentration of 10% was used in order to remove fragments below 150-200 bp. The amplified cDNA was allowed to bind to the CA-beads (Invitrogen) for 10 min and was then eluted into 15 μl of 10 mM Tris-HCl, pH 8.5.

Sequencing Library Preparation

Library Indexing

Samples amplified for 20 cycles were used further to prepare sequencing libraries. An index PCR master mix was prepared for each sample and 23 μl was placed into six 0.2 ml tubes. 2 μl of the amplified and purified cDNA was added to each of the six PCRs as template, so that the PCRs contained 1× Phusion master mix (Fermentas), 500 nM InPE1.0 (Illumina), 500 nM Index 1-12 (Illumina), and 0.4 nM InPE2.0 (Illumina). The samples were amplified in a thermocycler for 18 cycles at 98° C. for 30 seconds, 65° C. for 30 seconds and 72° C. for 1 minute, followed by a final extension at 72° C. for 5 minutes.

Sequencing Library Cleanup

After amplification, the samples were purified and concentrated by CA-purification with an MBS robot. A final PEG concentration of 7.8% was used in order to remove fragments below 300-350 bp. The amplified cDNA was allowed to bind to the CA-beads for 10 min and was then eluted into 15 μl of 10 mM Tris-HCl, pH 8.5.

10 μl of the amplified and purified samples were placed on a Caliper XT chip and fragments between 480 bp and 720 bp were cut out with the Caliper XT (Caliper). Samples were analyzed with an Agilent Bioanalyzer in order to confirm the presence and size of the finished libraries; the DNA High Sensitivity kit was used.

Sequencing and Data Analysis

Sequencing and bioinformatic analysis were carried out in the same way as in the protocol for 5′ to 3′ oriented high-density Nimblegen arrays described in Example 8. However, in the data analysis, read 1 was not used in the mapping of transcripts. Specific Olfr transcripts could be sorted out using the Matlab visualization tool (FIG. 27).

EXAMPLE 10

Spatial Transcriptomics Using in House Printed 41-Tag Microarray with 5′ to 3′ Oriented Probes and Formalin-Fixed Frozen (FF-Frozen) Tissue with Permeabilization Through ProteinaseK or Microwaving with USER System Cleavage and Amplification Via TdT

Array Preparation

In-house arrays were printed as previously described but with a pattern of 41 unique ID-tag probes with the same composition as the probes in the 5′ to 3′ oriented high-density array in Example 8 (FIG. 28).

All other steps were carried out in the same way as in the protocol described in Example 8.

EXAMPLE 11

Alternative Method for Performing the cDNA Synthesis Step

cDNA synthesis on chip as described above can also be combined with template switching to create a second strand by adding a template switching primer to the cDNA synthesis reaction (Table 2). The second amplification domain is introduced by coupling it to terminal bases added by the reverse transcriptase at the 3′ end of the first cDNA strand, and primes the synthesis of the second strand. The library can be readily amplified directly after release of the double-stranded complex from the array surface.

EXAMPLE 12

Spatial Genomics Using in House Printed 41-Tag Microarray with 5′ to 3′ Oriented Probes and Fragmented Poly-A Tailed gDNA with USER System Cleavage and Amplification Via TdT-Tailing or Translocation Specific Primers

Array Preparation

In-house arrays were printed using Codelink slides (Surmodics) as previously described but with a pattern of 41 unique ID-tag probes with the same composition as the probes in the 5′ to 3′ oriented high-density array in Example 8.

Total DNA Preparation from Cells

DNA Fragmentation

Genomic DNA (gDNA) was extracted by DNeasy kit (Qiagen) according to the manufacturer's instructions from A431 and U2OS cell lines. The DNA was fragmented to 500 bp on a Covaris sonicator (Covaris) according to the manufacturer's instructions.

The sample was purified and concentrated by CA-purification (purification by super-paramagnetic beads conjugated to carboxylic acid) with an MBS robot (Magnetic Biosolutions). A final PEG concentration of 10% was used in order to remove fragments below 150-200 bp. The fragmented DNA was allowed to bind to the CA-beads (Invitrogen) for 10 min and was then eluted into 15 μl of 10 mM Tris-HCl, pH 8.5.

Optional Control: Spiking of Different Cell Lines

By spiking A431 DNA into U2OS DNA, different levels of capture sensitivity can be measured, for example by spiking in 1%, 10% or 50% of A431 DNA.

dA-Tailing by Terminal Transferase

A 45 μl polyA-tailing mixture, according to the manufacturer's instructions consisting of TdT Buffer (Takara), 3 mM dATP (Takara) and TdT Enzyme mix (TdT and RNase H) (Takara), was added to 0.5 μg of fragmented DNA. The mixtures were incubated in a thermocycler at 37° C. for 30 minutes followed by inactivation of TdT at 80° C. for 20 minutes. The dA-tailed fragments were then cleaned through a Qiaquick (Qiagen) column according to the manufacturer's instructions and the concentration was measured using the Qubit system (Invitrogen) according to the manufacturer's instructions.

On-Chip Experiments

The hybridization, second strand synthesis and cleavage reactions were performed on chip in a 16 well silicone gasket (ArrayIT, Sunnyvale, Calif., USA). To prevent evaporation, the cassettes were covered with plate sealers (In Vitro AB, Stockholm, Sweden).

Hybridization

117 ng of DNA was deposited onto a well on a prewarmed array (50° C.) in a total volume of 45 μl consisting of 1×NEB buffer (New England Biolabs) and 1×BSA. The mixture was incubated for 30 mins at 50° C. in a Thermomixer Comfort (Eppendorf) fitted with an MTP block, with shaking at 300 rpm.

Second Strand Synthesis

Without removing the hybridization mixture, 15 μl of a Klenow extension reaction mixture consisting of 1×NEB buffer, 1.5 μl Klenow polymerase, and 3.75 μl dNTPs (2 mM each) was added to the well. The reaction mixture was incubated in a Thermomixer Comfort (Eppendorf) at 37° C. for 30 mins without shaking.

The slide was subsequently washed in 2×SSC, 0.1% SDS at 50° C. and 300 rpm for 10 minutes, 0.2×SSC at 300 rpm for 1 minute and 0.1×SSC at 300 rpm for 1 minute.

Probe Release

A volume of 50 μl of a mixture containing 1× FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl₂ (Roche), 200 μM dNTPs (New England Biolabs), 1×BSA and 0.1 U/μl USER Enzyme (New England Biolabs) was heated to 37° C., added to each well and incubated at 37° C. for 30 min with mixing (3 seconds at 300 rpm, 6 seconds at rest) (Thermomixer comfort; Eppendorf). The reaction mixture containing the released DNA was then recovered from the wells with a pipette.

Library Preparation

Amplification Reaction

Amplification was carried out in 10 μl reactions consisting of 7.5 μl released sample, 1 μl of each primer and 0.5 μl enzyme (Roche, FastStart HiFi PCR system). The reaction was cycled as 94° C. for 2 mins, one cycle of 94° C. for 15 secs, 55° C. for 2 mins, 72° C. for 2 mins, 30 cycles of 94° C. for 15 secs, 65° C. for 30 secs, 72° C. for 90 secs, and a final elongation at 72° C. for 5 mins.

In the preparation of a library for sequencing the two primers consisted of the surface probe A-handle and either a specific translocation primer (for A431) or a specific SNP primer coupled to the B-handle (Table 2).

Library Cleanup

The purified and concentrated sample was further purified and concentrated by CA-purification (purification by superparamagnetic beads conjugated to carboxylic acid) with an MBS robot (Magnetic Biosolutions). A final PEG concentration of 10% was used in order to remove fragments below 150-200 bp. The amplified DNA was allowed to bind to the CA-beads (Invitrogen) for 10 min and was then eluted into 15 μl of 10 mM Tris-HCl, pH 8.5.

Library Quality Analysis

Samples were analyzed with an Agilent Bioanalyzer (Agilent) in order to confirm the presence of an amplified DNA library; the DNA High Sensitivity kit or DNA 1000 kit was used depending on the amount of material.

Library Indexing

Samples amplified for 20 cycles were used further to prepare sequencing libraries. An index PCR master mix was prepared for each sample and 23 μl was placed into six 0.2 ml tubes. 2 μl of the amplified and purified cDNA was added to each of the six PCRs as template, so that the PCRs contained 1× Phusion master mix (Fermentas), 500 nM InPE1.0 (Illumina), 500 nM Index 1-12 (Illumina), and 0.4 nM InPE2.0 (Illumina). The samples were amplified in a thermocycler for 18 cycles at 98° C. for 30 seconds, 65° C. for 30 seconds and 72° C. for 1 minute, followed by a final extension at 72° C. for 5 minutes.

Sequencing Library Cleanup

The purified and concentrated sample was further purified and concentrated by CA-purification with an MBS robot. A final PEG concentration of 7.8% was used in order to remove fragments below 300-350 bp. The amplified DNA was allowed to bind to the CA-beads for 10 min and was then eluted into 15 μl of 10 mM Tris-Cl, pH 8.5. Samples were analyzed with an Agilent Bioanalyzer in order to confirm the presence and size of the finished libraries; the DNA High Sensitivity kit or DNA 1000 kit was used according to the manufacturer's instructions depending on the amount of material (FIG. 29).

Sequencing

Sequencing was carried out in the same way as in the protocol for 5′ to 3′ oriented high-density Nimblegen arrays described in Example 8.

Data Analysis

Data analysis was carried out to determine the sensitivity of capture of the arrayed ID-capture probes. Read 2 was sorted based on whether it contained either the translocation or the SNP primers. These reads were then sorted by the ID contained in Read 1.
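The two-level sorting described above can be sketched as follows. This Python snippet is a minimal illustration under stated assumptions, not the analysis code actually used: it assumes that read 2 begins with the target-specific part of a B-handle primer (the portion after the common B-handle in Table 2), that the ID occupies the first 18 bases of read 1, and that the two primers chosen from Table 2 are representative. The function and variable names are hypothetical.

```python
from collections import defaultdict

ID_LENGTH = 18  # assumption: ID-tag length inferred from the Table 2 probes

# Target-specific parts of two Table 2 primers (the sequence following the
# common B-handle AGACGTGTGCTCTTCCGATCT). The selection is illustrative.
PRIMER_TARGETS = {
    "B_A431_Chr2+2_FW_A": "TGGCTGCCTGAGGCAATG",
    "B_NT_1_FW": "CATTCCCACACTCATCACAC",
}


def classify_read2(read2):
    """Return the name of the primer whose target part starts read 2, else None."""
    for name, target in PRIMER_TARGETS.items():
        if read2.startswith(target):
            return name
    return None


def sort_pairs(pairs):
    """Count read pairs per primer (from read 2) and per ID-tag (from read 1)."""
    counts = defaultdict(lambda: defaultdict(int))
    for read1, read2 in pairs:
        primer = classify_read2(read2)
        if primer is None:
            continue  # read 2 carries none of the listed primers
        counts[primer][read1[:ID_LENGTH]] += 1
    return {primer: dict(ids) for primer, ids in counts.items()}
```

The resulting counts per primer and per ID give one possible measure of how often each arrayed ID-capture probe captured a fragment carrying the sequence of interest.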

Optional Control: Direct Amplification of Cell-Line Specific Translocations

This was used to measure the capture sensitivity of spiked cell lines directly by PCR. The forward and reverse primers (Table 2) for the A431 translocations were used to try and detect the presence of the translocation in the second strand copied and released material (FIG. 30).

TABLE 2
Oligos used for spatial transcriptomics and spatial genomics (5′ to 3′)

Example 8: Nimblegen 5′ to 3′ arrays with free 3′ end

Array probes
Probe1 (SEQ ID NO: 50) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTGTCCGATATGATTGCCGCTTTTTTTTTTTTTTTTTTTTVN
Probe2 (SEQ ID NO: 51) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTATGAGCCGGGTTCATCTTTTTTTTTTTTTTTTTTTTTTVN
Probe3 (SEQ ID NO: 52) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTTGAGGCACTCTGTTGGGATTTTTTTTTTTTTTTTTTTTVN
Probe4 (SEQ ID NO: 53) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTATGATTAGTCGCCATTCGTTTTTTTTTTTTTTTTTTTTVN
Probe5 (SEQ ID NO: 54) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGAGGGTAGATGTTTTTTTTTTTTTTTTTTTTTTTVN
Probe6 (SEQ ID NO: 55) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTATGGCCAATACTGTTATCTTTTTTTTTTTTTTTTTTTTVN
Probe7 (SEQ ID NO: 56) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTCGCTACCCTGATTCGACCTTTTTTTTTTTTTTTTTTTTVN
Probe8 (SEQ ID NO: 57) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTGCCCACTTTCGCCGTAGTTTTTTTTTTTTTTTTTTTTTVN
Probe9 (SEQ ID NO: 58) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTAGCAACTTTGAGCAAGATTTTTTTTTTTTTTTTTTTTTVN
Probe10 (SEQ ID NO: 59) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTGCCAATTCGGAATTCCGGTTTTTTTTTTTTTTTTTTTTVN
Probe11 (SEQ ID NO: 60) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTTCGCCCAAGGTAATACATTTTTTTTTTTTTTTTTTTTTVN
Probe12 (SEQ ID NO: 61) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTTCGCATTTCCTATTCGAGTTTTTTTTTTTTTTTTTTTTVN
Probe13 (SEQ ID NO: 62) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGCTAAATCTAACCGCCTTTTTTTTTTTTTTTTTTTTVN
Probe14 (SEQ ID NO: 63) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTGGAATTAAATTCTGATGGTTTTTTTTTTTTTTTTTTTTVN
Probe15 (SEQ ID NO: 64) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTCATTACATAGGTGCTAAGTTTTTTTTTTTTTTTTTTTTVN
Probe16 (SEQ ID NO: 65) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTATTGACTTGCGCTCGCACTTTTTTTTTTTTTTTTTTTTVN
Probe17 (SEQ ID NO: 66) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTATAGTATCTCCCAAGTTCTTTTTTTTTTTTTTTTTTTTVN
Probe18 (SEQ ID NO: 67) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTGTGCGCCTGTAATCCGCATTTTTTTTTTTTTTTTTTTTVN
Probe19 (SEQ ID NO: 68) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTGCGCCACTCTTTAGGTAGTTTTTTTTTTTTTTTTTTTTVN
Probe20 (SEQ ID NO: 69) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTTATGCAAGTGATTGGCTTTTTTTTTTTTTTTTTTTTTTVN
Probe21 (SEQ ID NO: 70) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTCCAAGCCACGTTTATACGTTTTTTTTTTTTTTTTTTTTVN
Probe22 (SEQ ID NO: 71) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTACCTGATTGCTGTATAACTTTTTTTTTTTTTTTTTTTTVN
Probe23 (SEQ ID NO: 72) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTCAGCGCATCTATCCTCTATTTTTTTTTTTTTTTTTTTTVN
Probe24 (SEQ ID NO: 73) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCACGCGTAGGACTAGTTTTTTTTTTTTTTTTTTTTTVN
Probe25 (SEQ ID NO: 74) UUUUUACACTCTTTCCCTACACGACGCTCTTCCGATCTCGACTAAGTATGTAGCGCTTTTTTTTTTTTTTTTTTTTVN

Frame probe
Layout1 (SEQ ID NO: 75) AAATTTCGTCTGCTATCGCGCTTCTGTACC

Fluorescent marker probe
PS_1 (SEQ ID NO: 76) GGTACAGAAGCGCGATAGCAG-Cy3

Second strand synthesis and first PCR amplification handles
A_primer (SEQ ID NO: 77) ACACTCTTTCCCTACACGACGCTCTTCCGATCT
B_dt20VN_primer (SEQ ID NO: 78) AGACGTGTGCTCTTCCGATCTTTTTTTTTTTTTTTTTTTTTVN

Custom sequencing primer
B_r2 (SEQ ID NO: 79) TCA GAC GTG TGC TCT TCC GAT CTT TTT TTT TTT TTT TTT TTT T

Example 9: Nimblegen 3′ to 5′ arrays with free 5′ end

Array probes
Probe1 (SEQ ID NO: 80) GCGTTCAGAGTGGCAGTCGAGATCACGCGGCAATCATATCGGACAGATCGGAAGAGCGTAGTGTAG
Probe2 (SEQ ID NO: 81) GCGTTCAGAGTGGCAGTCGAGATCACAAGATGAACCCGGCTCATAGATCGGAAGAGCGTAGTGTAG
Probe3 (SEQ ID NO: 82) GCGTTCAGAGTGGCAGTCGAGATCACTCCCAACAGAGTGCCTCAAGATCGGAAGAGCGTAGTGTAG
Probe4 (SEQ ID NO: 83) GCGTTCAGAGTGGCAGTCGAGATCACCGAATGGCGACTAATCATAGATCGGAAGAGCGTAGTGTAG
Probe5 (SEQ ID NO: 84) GCGTTCAGAGTGGCAGTCGAGATCACAAACATCTACCCTCAAGTAGATCGGAAGAGCGTAGTGTAG
Probe6 (SEQ ID NO: 85) GCGTTCAGAGTGGCAGTCGAGATCACGATAACAGTATTGGCCATAGATCGGAAGAGCGTAGTGTAG
Probe7 (SEQ ID NO: 86) GCGTTCAGAGTGGCAGTCGAGATCACGGTCGAATCAGGGTAGCGAGATCGGAAGAGCGTAGTGTAG
Probe8 (SEQ ID NO: 87) GCGTTCAGAGTGGCAGTCGAGATCACACTACGGCGAAAGTGGGCAGATCGGAAGAGCGTAGTGTAG
Probe9 (SEQ ID NO: 88) GCGTTCAGAGTGGCAGTCGAGATCACATCTTGCTCAAAGTTGCTAGATCGGAAGAGCGTAGTGTAG
Probe10 (SEQ ID NO: 89) GCGTTCAGAGTGGCAGTCGAGATCACCCGGAATTCCGAATTGGCAGATCGGAAGAGCGTAGTGTAG
Probe11 (SEQ ID NO: 90) GCGTTCAGAGTGGCAGTCGAGATCACATGTATTACCTTGGGCGAAGATCGGAAGAGCGTAGTGTAG
Probe12 (SEQ ID NO: 91) GCGTTCAGAGTGGCAGTCGAGATCACCTCGAATAGGAAATGCGAAGATCGGAAGAGCGTAGTGTAG
Probe13 (SEQ ID NO: 92) GCGTTCAGAGTGGCAGTCGAGATCACGGCGGTTAGATTTAGCAAAGATCGGAAGAGCGTAGTGTAG
Probe14 (SEQ ID NO: 93) GCGTTCAGAGTGGCAGTCGAGATCACCCATCAGAATTTAATTCCAGATCGGAAGAGCGTAGTGTAG
Probe15 (SEQ ID NO: 94) GCGTTCAGAGTGGCAGTCGAGATCACCTTAGCACCTATGTAATGAGATCGGAAGAGCGTAGTGTAG
Probe16 (SEQ ID NO: 95) GCGTTCAGAGTGGCAGTCGAGATCACGTGCGAGCGCAAGTCAATAGATCGGAAGAGCGTAGTGTAG
Probe17 (SEQ ID NO: 96) GCGTTCAGAGTGGCAGTCGAGATCACGAACTTGGGAGATACTATAGATCGGAAGAGCGTAGTGTAG
Probe18 (SEQ ID NO: 97) GCGTTCAGAGTGGCAGTCGAGATCACTGCGGATTACAGGCGCACAGATCGGAAGAGCGTAGTGTAG
Probe19 (SEQ ID NO: 98) GCGTTCAGAGTGGCAGTCGAGATCACCTACCTAAAGAGTGGCGCAGATCGGAAGAGCGTAGTGTAG
Probe20 (SEQ ID NO: 99) GCGTTCAGAGTGGCAGTCGAGATCACAAGCCAATCACTTGCATAAGATCGGAAGAGCGTAGTGTAG
Probe21 (SEQ ID NO: 100) GCGTTCAGAGTGGCAGTCGAGATCACCGTATAAACGTGGCTTGGAGATCGGAAGAGCGTAGTGTAG
Probe22 (SEQ ID NO: 101) GCGTTCAGAGTGGCAGTCGAGATCACGTTATACAGCAATCAGGTAGATCGGAAGAGCGTAGTGTAG
Probe23 (SEQ ID NO: 102) GCGTTCAGAGTGGCAGTCGAGATCACTAGAGGATAGATGCGCTGAGATCGGAAGAGCGTAGTGTAG
Probe24 (SEQ ID NO: 103) GCGTTCAGAGTGGCAGTCGAGATCACACTAGTCCTACGCGTGGAAGATCGGAAGAGCGTAGTGTAG
Probe25 (SEQ ID NO: 104) GCGTTCAGAGTGGCAGTCGAGATCACGCGCTACATACTTAGTCGAGATCGGAAGAGCGTAGTGTAG

Frame probe
Layout1 (SEQ ID NO: 105) AAATTTCGTCTGCTATCGCGCTTCTGTACC

Capture probe
LP_Poly-dTVN (SEQ ID NO: 106) GTGATCTCGACTGCCACTCTGAATTTTTTTTTTTTTTTTTTTTVN

Amplification handle probe
A-handle (SEQ ID NO: 107) ACACTCTTTCCCTACACGACGCTCTTCCGATCT

Second strand synthesis and first PCR amplification handles
A_primer (SEQ ID NO: 108) ACACTCTTTCCCTACACGACGCTCTTCCGATCT
B_dt20VN_primer (SEQ ID NO: 109) AGACGTGTGCTCTTCCGATCTTTTTTTTTTTTTTTTVN

Second PCR
A_primer (SEQ ID NO: 110) ACACTCTTTCCCTACACGACGCTCTTCCGATCT
B_primer (SEQ ID NO: 111) GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT

Example 11: Template switching
Templateswitch_longB (SEQ ID NO: 112) GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTATrGrGrG

Example 12: Spatial genomics
A_primer (SEQ ID NO: 113) ACACTCTTTCCCTACACGACGCTCTTCCGATCT
B_A431_Chr2+2_FW_A (SEQ ID NO: 114) AGACGTGTGCTCTTCCGATCTTGGCTGCCTGAGGCAATG
B_A431_Chr2+2_RE_A (SEQ ID NO: 115) AGACGTGTGCTCTTCCGATCTCTCGCTAACAAGCAGAGAGAAC
B_A431_Chr3+7_FW_B (SEQ ID NO: 116) AGACGTGTGCTCTTCCGATCTTGAGAACAAGGGGGAAGAG
B_A431_Chr3+7_RE_B (SEQ ID NO: 117) AGACGTGTGCTCTTCCGATCTCGGTGAAACAAGCAGGTAAC
B_NT_1_FW (SEQ ID NO: 118) AGACGTGTGCTCTTCCGATCTCATTCCCACACTCATCACAC
B_NT_1_RE (SEQ ID NO: 119) AGACGTGTGCTCTTCCGATCTTCACACTGGAGAAAGACCC
B_NT_2_FW (SEQ ID NO: 120) AGACGTGTGCTCTTCCGATCTGGGGTTCAGAGTGATTTTTCAG
B_NT_2_RE (SEQ ID NO: 121) AGACGTGTGCTCTTCCGATCTTCCGTTTTCTTTCAGTGCC

What is claimed is:
 1. A method for localized detection of RNA in atissue sample comprising cells, said method comprising: (a) providing anarray comprising a plurality of features on a substrate, each featurecomprising a different capture probe immobilized thereon such that thecapture probe has a free 3′ end, each feature occupying a distinctposition on the array and having an area of less than about 1 mm², eachcapture probe consisting of a nucleic acid molecule comprising thefollowing domains oriented 5′ to 3′: (i) a positional domain comprisinga nucleotide sequence unique to a particular feature; and (ii) a capturedomain comprising a nucleotide sequence complementary to the RNA to bedetected; (b) contacting said array with the tissue sample comprisingcells such that the tissue sample contacts a plurality of the featuresat their distinct positions on the array; (c) hybridizing the RNApresent in the tissue sample comprising cells that are complementary tothe capture sequences of the capture probes immobilized on the features,such that the RNA is captured by the capture domain of the captureprobes in the features; (d) generating cDNA molecules from the capturedRNA, by extending the capture probes enzymatically using the capturedRNA as an extension template, such that the cDNA molecules comprise thenucleotide sequences of the positional domains; (e) releasing at leastpart of the cDNA molecules from the features of the surface of thearray, and (f) identifying nucleotide sequences of the positional domainor sequences complementary to the nucleotide sequences of the positionaldomain present in the released cDNA molecules, wherein the presence ofthe nucleotide sequence of the positional domain unique to a givenparticular feature or the sequence complementary to the nucleotidesequence of the positional domain unique to said particular featureindicates that the released cDNA molecule was obtained from RNA presentin the tissue sample comprising cells at the distinct position where thetissue 
sample comprising cells contacted said particular feature.
 2. Themethod of claim 1, further comprising a step of generating acomplementary strand of the cDNA molecules.
 3. The method of claim 2,wherein the step of releasing at least part of the cDNA molecules fromthe surface of the array comprises releasing the complementary strandsof the cDNA molecules by denaturation.
4. The method of claim 3 further comprising a step of amplifying the released complementary strands of the cDNA molecules.
5. The method of claim 1 further comprising a step of amplifying the cDNA molecules such that the amplified cDNA molecules comprise nucleotide sequences of the positional domains and nucleotide sequences complementary to the positional sequences.
6. The method of claim 5, wherein said step of amplifying the cDNA molecules functions as the step of releasing at least part of the cDNA molecules from the features of the surface of the array.
7. The method of claim 1, wherein each capture probe consisting of a nucleic acid molecule comprises the following domains oriented 5′ to 3′: (i) a cleavage domain; (ii) a positional domain comprising a nucleotide sequence unique to a particular feature; and (iii) a capture domain comprising a nucleotide sequence complementary to the RNA to be detected, and wherein the step of releasing at least part of the cDNA molecules from the features of the surface of the array comprises cleaving the cleavage domain.
8. The method of claim 7, wherein the step of cleaving the cleavage domain comprises cleaving the cleavage domain with a cleavage enzyme that recognizes a nucleotide sequence in the cleavage domain and cleaves the cDNA molecules at a position that is 5′ to the positional domain.
9. The method of claim 1, wherein the RNA is selected from the list consisting of mRNA, tRNA, rRNA, viral RNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), microRNA (miRNA), small interfering RNA (siRNA), piwi-interacting RNA (piRNA), ribozymal RNA, antisense RNA and non-coding RNA.
10. The method of claim 9, wherein the RNA is mRNA and the capture domain is designed for the selective capture of mRNA.
11. The method of claim 10, wherein the capture domain that is designed for the selective capture of mRNA hybridizes to the poly-A tail of mRNA.
12. The method of claim 10, wherein the domain that is designed for the selective capture of mRNA comprises a poly-T sequence.
13. The method of claim 1, wherein the capture domain comprises a random hexamer sequence.
14. The method of claim 1, wherein step (f) comprises sequencing the released cDNA molecules.
15. The method of claim 14 further comprising a step of correlating the sequence analysis information obtained in step (f) with an image of said tissue sample comprising cells, wherein the method includes a step of imaging the tissue sample comprising cells after step (b).
16. The method of claim 1, further comprising determining which genes are expressed at a particular distinct location of the tissue sample comprising cells by a method comprising determining the sequences of the released cDNA molecules comprising the same nucleotide sequence of a positional domain or sequence complementary to the nucleotide sequence of a positional domain.
17. The method of claim 1, further comprising determining where a particular gene is expressed in the tissue sample comprising cells by a method comprising identifying the released cDNA molecules comprising a sequence associated with said particular gene and determining which nucleotide sequences of the positional domains or sequences complementary to the nucleotide sequences of the positional domains are attached thereto.
18. The method of claim 1, further comprising correlating the nucleotide sequence of a positional domain unique to a given particular feature or the sequence complementary to the nucleotide sequence of a positional domain unique to said particular feature present in the released cDNA molecules to a position in the tissue sample.
19. The method of claim 1, wherein the tissue sample comprising cells is a tissue section or a cell suspension.
20. The method of claim 1, wherein capture probes are immobilized on the substrate by a chemical linker.
21. The method of claim 1, wherein the array is a bead array and the capture probes are immobilized on the beads of the array.
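
By way of illustration only, the data-analysis side of step (f) of claim 1 and of claims 16 to 18 may be sketched as follows. This is a minimal sketch under stated assumptions and forms no part of the claims: the barcode table `FEATURE_BY_BARCODE`, the fixed barcode length, the function names and the pre-assigned gene labels are all hypothetical, and a real pipeline would use an aligner and tolerate sequencing errors. Each read is assumed to carry the positional-domain sequence at its 5′ end, or the complementary sequence when the opposite strand was sequenced.

```python
from collections import Counter, defaultdict

# Hypothetical lookup: positional-domain sequence -> (x, y) position of the feature.
FEATURE_BY_BARCODE = {
    "ACGTAC": (0, 0),
    "TTGACA": (0, 1),
    "GGCATT": (1, 0),
}
BARCODE_LEN = 6  # assumed fixed length of the positional domain


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (non-ACGT letters are kept)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def assign_feature(read):
    """Map a read to a feature position via its positional domain.

    Both the read and its reverse complement are checked, so a read that
    carries the sequence complementary to the positional domain is also
    assigned. Returns None when no known positional domain is found.
    """
    for candidate in (read, reverse_complement(read)):
        position = FEATURE_BY_BARCODE.get(candidate[:BARCODE_LEN])
        if position is not None:
            return position
    return None


def expression_by_feature(reads):
    """Count gene labels per feature position.

    `reads` is an iterable of (sequence, gene) pairs; the gene label is
    assumed to come from an upstream alignment step not shown here.
    """
    counts = defaultdict(Counter)
    for sequence, gene in reads:
        position = assign_feature(sequence)
        if position is not None:
            counts[position][gene] += 1
    return counts


def genes_at(counts, position):
    """Genes detected at one position (the query of claim 16)."""
    return set(counts.get(position, {}))


def positions_of(counts, gene):
    """Positions at which one gene was detected (the query of claim 17)."""
    return {position for position, genes in counts.items() if gene in genes}
```

In use, `expression_by_feature` would be applied to the sequenced cDNA molecules, after which `genes_at` and `positions_of` answer the two complementary questions of claims 16 and 17. The positions returned can then be overlaid on an image of the tissue sample as in claim 15.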